<feed xmlns="http://www.w3.org/2005/Atom">
  <generator uri="https://jekyllrb.com/" version="3.9.0">Jekyll</generator>
  <link href="https://jktauber.com/feed/all/atom/index.xml" rel="self" type="application/atom+xml"/>
  <link href="https://jktauber.com/" rel="alternate" type="text/html"/>
  <updated>2020-10-21T07:42:43+00:00</updated>
  <id>https://jktauber.com/feed/all/atom/index.xml</id>
  <title type="html">J. K. Tauber</title>
  <subtitle>at the intersection of computing, linguistics, philology, and learning science</subtitle>
  <author>
    <name>James Tauber</name>
  </author><entry>
    <title type="html">Working on Plato Texts</title>
    <link href="https://jktauber.com/2020/10/21/working-on-plato-texts/" rel="alternate" type="text/html" title="Working on Plato Texts"/>
    <published>2020-10-21T14:35:00+08:00</published>
    <updated>2020-10-21T14:35:00+08:00</updated>
    <id>https://jktauber.com/2020/10/21/working-on-plato-texts</id>
    <content type="html" xml:base="https://jktauber.com/2020/10/21/working-on-plato-texts/">&lt;p&gt;I started working on some Plato texts a while ago but now I&#39;m back to it, integrating various information and hitting some more issues with the Diorisis corpus.&lt;/p&gt;
&lt;p&gt;About a year ago, I wrote about some &lt;a href=&#34;https://jktauber.com/2019/11/05/subcorpus-vocabulary-statistics/&#34;&gt;vocabulary statistics&lt;/a&gt; I&#39;d put together around various texts. This included a subcorpus of Plato I wanted to put together for the &lt;a href=&#34;https://greek-learner-texts.org&#34;&gt;Greek Learner Texts Project&lt;/a&gt;. Based on the &#34;core&#34; works list I&#39;d put together for &lt;a href=&#34;https://vocab.perseus.org&#34;&gt;https://vocab.perseus.org&lt;/a&gt;, this included:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Euthyphro&lt;/li&gt;
&lt;li&gt;Apology&lt;/li&gt;
&lt;li&gt;Crito&lt;/li&gt;
&lt;li&gt;Symposium&lt;/li&gt;
&lt;li&gt;Republic&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;At the time, I used Giuseppe Celano&#39;s experimental lemmatization as the basis for the vocabulary counts.&lt;/p&gt;
&lt;p&gt;In the intervening period, I went further with Crito and restructured the citation scheme to be based on units of dialogue and sentences. You can see HTML generated from the result at:&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://jtauber.github.io/plato-texts/&#34;&gt;https://jtauber.github.io/plato-texts/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;but the underlying data is available at &lt;a href=&#34;https://github.com/jtauber/plato-texts/blob/master/text/crito.txt&#34;&gt;https://github.com/jtauber/plato-texts/blob/master/text/crito.txt&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I also took a first pass at aligning an English translation at the sentence level and the raw data for that is available at &lt;a href=&#34;https://github.com/jtauber/plato-texts/blob/master/analysis/crito_aligned.txt&#34;&gt;https://github.com/jtauber/plato-texts/blob/master/analysis/crito_aligned.txt&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;My plan was always to return to the other four texts and last weekend I started on that, freshly bringing in the texts (in both Greek and English) from Perseus, the Diorisis corpus tagging, and the treebanks from AGLDT (Euthyphro) and Vanessa Gorman (the Apology).&lt;/p&gt;
&lt;p&gt;I also added:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Phaedo&lt;/li&gt;
&lt;li&gt;Meno&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;and may add others if they seem appropriate for the Greek Learner Texts Project.&lt;/p&gt;
&lt;p&gt;The first thing I did was produce a stripped-down tokenized version of the Greek texts from Perseus with minimal markup. In this process, I found a small number of issues with the Perseus XML which I&#39;ll submit corrections for shortly (mostly some stray gammas).&lt;/p&gt;
&lt;p&gt;I then wrote a script to extract similar tokens from Diorisis for alignment. As I&#39;ve &lt;a href=&#34;https://jktauber.com/2020/01/20/working-with-the-diorisis-ancient-greek-corpus/&#34;&gt;written about before&lt;/a&gt;, the Diorisis corpus made the odd choice to use betacode for the tokens so I had to do a conversion. Then the real fun began.&lt;/p&gt;
&lt;p&gt;Firstly, the Perseus text, based on the Burnet edition, has various editorial markup like &lt;code&gt;&amp;lt;add&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;del&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;corr&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;sic&amp;gt;&lt;/code&gt;. I quickly discovered that the Diorisis text drops the &lt;code&gt;&amp;lt;del&amp;gt;&lt;/code&gt; and &lt;code&gt;&amp;lt;sic&amp;gt;&lt;/code&gt; elements. That&#39;s fine although I might seek the advice of people more familiar with Burnet and the text scholarship of Plato as to what the Greek Learner Texts edition should do.&lt;/p&gt;
&lt;p&gt;Secondly, in Phaedo at least, named entities are marked up in the Perseus TEI XML. People and places are all appropriately tagged. I don&#39;t happen to need that right now although it&#39;s potentially useful information. But the Diorisis corpus drops those elements. I don&#39;t just mean it drops the tags, it drops the &lt;em&gt;elements&lt;/em&gt;. So if the sentence was &lt;code&gt;&amp;lt;persName&amp;gt;John&amp;lt;/persName&amp;gt; loves &amp;lt;persName&amp;gt;Mary&amp;lt;/persName&amp;gt;&lt;/code&gt;, Diorisis would just give the sentence as &lt;code&gt;loves&lt;/code&gt; (at least in Phaedo). Fairly easy to work around for alignment purposes, though.&lt;/p&gt;
&lt;p&gt;The more time-consuming aspect is the odd way Diorisis handles quotations. It seems to repeat the tokens of each quotation, once in context and then once in a sentence of its own. Except sometimes the repetition is incorporated in an unrelated sentence.&lt;/p&gt;
&lt;p&gt;For example, the Homeric quotation in 408a (Republic Book 3) is analyzed inline but then also repeated in another sentence where it&#39;s part of the first sentence of 409a (&#34;δικαστὴς δέ γε...&#34;) which, unless I&#39;m missing something considerable, is just completely wrong.&lt;/p&gt;
&lt;p&gt;I&#39;m manually correcting all this (it comes up as an alignment mismatch and I&#39;m going in and editing the Diorisis XML to remove the duplication). But even without the bad sentence merges, this also means that the vocabulary counts I&#39;ve previously generated from Diorisis (and in &lt;code&gt;vocabulary-tools&lt;/code&gt;) may have doubled up on any words appearing in quotations.&lt;/p&gt;
&lt;p&gt;So there&#39;s lots more to do with Plato, not least of all the manual curation of lemmatization. But the goal, like that of the Greek Learner Texts Project as a whole, is to have a set of openly-licensed, high-quality, lemmatized texts for extensive reading by language learners.&lt;/p&gt;
&lt;p&gt;Collaboration always welcome. Just ping me on the &lt;a href=&#34;https://greek-learner-texts.org&#34;&gt;Greek Learner Texts Project&lt;/a&gt; Slack workspace.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I started working on some Plato texts a while ago but now I&#39;m back to it, integrating various information and hitting some more issues with the Diorisis corpus.</summary>
  </entry><entry>
    <title type="html">Ordering Vocabulary by Pericope Dispersion</title>
    <link href="https://jktauber.com/2020/10/04/ordering-vocabulary-by-pericope-dispersion/" rel="alternate" type="text/html" title="Ordering Vocabulary by Pericope Dispersion"/>
    <published>2020-10-04T22:35:00+08:00</published>
    <updated>2020-10-04T22:35:00+08:00</updated>
    <id>https://jktauber.com/2020/10/04/ordering-vocabulary-by-pericope-dispersion</id>
    <content type="html" xml:base="https://jktauber.com/2020/10/04/ordering-vocabulary-by-pericope-dispersion/">&lt;p&gt;Jesse Egbert&#39;s Plenary at JAECS 2020 is giving me a bunch of ideas of things to try on the New Testament and larger Greek corpora. In this post, I briefly explore text dispersion keyness using pericopes as a way of ordering vocabulary.&lt;/p&gt;
&lt;p&gt;Back in &lt;a href=&#34;https://jktauber.com/2018/01/21/lexical-dispersion-greek-new-testament-gries-dp/&#34;&gt;Lexical Dispersion in the Greek New Testament Via Gries’s DP&lt;/a&gt; I wrote:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;My sense is that dispersion might be a useful input to deciding what vocabulary to learn. For example διδαχή or σκότος might be better to learn before ἀρνίον because, even though they all have the same frequency, you are more likely to encounter διδαχή or σκότος in a random book or chapter.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Egbert&#39;s plenary (available &lt;a href=&#34;https://www.jaecs2020.org/plenary-talk-1/&#34;&gt;here&lt;/a&gt; after free signup) encouraged me to try a very simple metric instead of frequency: what proportion of text units in the corpus does the word appear in? Egbert emphasises using linguistically meaningful units of text (definitely not fixed-length windows) and pericopes seem perfect for this. There &lt;em&gt;are&lt;/em&gt; dispersion measures that allow for varying sizes of text unit (like Gries&#39;s DP) but it seemed to me that &lt;strong&gt;just seeing what proportion of pericopes the item appears in might be a good measure of the importance to learn&lt;/strong&gt; (instead of frequency).&lt;/p&gt;
&lt;p&gt;This downplays words that might get repeated a lot in just a handful of pericopes and favours those that appear in lots of pericopes even if only one or two times in that pericope. Intuitively this makes sense. A word that appears 10 times in one passage in the New Testament (and nowhere else) isn&#39;t as generally useful to learn as a word that appears once in ten different passages. Overall corpus frequency can therefore be misleading because it treats these two cases as the same.&lt;/p&gt;
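&lt;p&gt;As a rough illustration of the metric (a minimal sketch, not the code actually used for this post), the following assumes each pericope is simply a list of lemmas. The function name &lt;code&gt;pericope_dispersion&lt;/code&gt; and the toy lemmas are made up for the example.&lt;/p&gt;

```python
from collections import Counter


def pericope_dispersion(pericopes):
    """Return {lemma: proportion of pericopes the lemma appears in}."""
    counts = Counter()
    for lemmas in pericopes:
        # A set counts each lemma once per pericope, however often it repeats.
        counts.update(set(lemmas))
    total = len(pericopes)
    return {lemma: n / total for lemma, n in counts.items()}


# Hypothetical toy data: four pericopes, each a list of lemmas.
pericopes = [
    ["a", "b", "b", "b"],
    ["a", "c"],
    ["a", "b"],
    ["c"],
]

dispersion = pericope_dispersion(pericopes)

# Plain corpus frequency, for comparison.
frequency = Counter(lemma for lemmas in pericopes for lemma in lemmas)

# Order by dispersion (highest first), breaking ties alphabetically.
ranked = sorted(dispersion, key=lambda lemma: (-dispersion[lemma], lemma))

print(ranked)     # ['a', 'b', 'c']
print(frequency)  # "b" is the most frequent lemma but not the most dispersed
```

&lt;p&gt;In the toy data, &lt;code&gt;b&lt;/code&gt; has the highest frequency (4) but appears in only half of the pericopes, so &lt;code&gt;a&lt;/code&gt; (3 of 4 pericopes) is ranked ahead of it.&lt;/p&gt;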
&lt;p&gt;With &lt;code&gt;vocabulary-tools&lt;/code&gt; it was trivial to produce a list of all the New Testament lemmas sorted by pericope dispersion.&lt;/p&gt;
&lt;p&gt;This gist contains the code and the list:&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://gist.github.com/jtauber/fc4b0476a4c4a94d7cb01d068161892e&#34;&gt;https://gist.github.com/jtauber/fc4b0476a4c4a94d7cb01d068161892e&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Eyeballing the resultant list, it seems a very promising ordering although I welcome comments on anything interesting people notice.&lt;/p&gt;
&lt;p&gt;Next steps are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;quantitative comparison with pure frequency&lt;/li&gt;
&lt;li&gt;application to other lemmatized Greek corpora with meaningful text units similar to pericopes&lt;/li&gt;
&lt;li&gt;try other meaningful text units I have for NT such as books or paragraphs or even sentences&lt;/li&gt;
&lt;/ol&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Jesse Egbert&#39;s Plenary at JAECS 2020 is giving me a bunch of ideas of things to try on the New Testament and larger Greek corpora. In this post, I briefly explore text dispersion keyness using pericopes as a way of ordering vocabulary.</summary>
  </entry><entry>
    <title type="html">More on Plato After GNT</title>
    <link href="https://jktauber.com/2020/09/14/more-on-plato-after-gnt/" rel="alternate" type="text/html" title="More on Plato After GNT"/>
    <published>2020-09-14T08:35:00+08:00</published>
    <updated>2020-09-14T08:35:00+08:00</updated>
    <id>https://jktauber.com/2020/09/14/more-on-plato-after-gnt</id>
    <content type="html" xml:base="https://jktauber.com/2020/09/14/more-on-plato-after-gnt/">&lt;p&gt;In the &lt;a href=&#34;https://jktauber.com/2020/09/02/plato-vocabulary-coverage-after-new-testament/&#34;&gt;previous post&lt;/a&gt;, we looked at lemma and token coverage in the works of Plato assuming knowledge of Greek New Testament vocabulary. Here we graphically look at those results and make an important observation.&lt;/p&gt;
&lt;p&gt;For this first chart, I haven&#39;t just shown the GNT 100% and 80% but also the 98%, 95%, and 90% levels. The chart shows, assuming you&#39;ve learned a certain % of GNT lemmas, how many tokens in the works of Plato are from those lemmas plotted against the length of the Plato work. All the plots here are log-log because of the Zipfian nature of word distributions (although it is more important in subsequent plots than this one).&lt;/p&gt;
&lt;div align=&#34;center&#34;&gt;
&lt;img src=&#34;/images/gnt_plato_2.png&#34; width=&#34;80%&#34;&gt;
&lt;/div&gt;

&lt;p&gt;As mentioned in the previous post, I was actually surprised at how little coverage drops off as a function of the length of the Plato work. A 100,000 token work has very similar token coverage to a 5,000 token work.&lt;/p&gt;
&lt;p&gt;Visually this can be seen in how horizontal the best-fit lines are above.&lt;/p&gt;
&lt;p&gt;However, when it comes to lemma coverage rather than token coverage, the story is very different:&lt;/p&gt;
&lt;div align=&#34;center&#34;&gt;
&lt;img src=&#34;/images/gnt_plato_1.png&#34; width=&#34;80%&#34;&gt;
&lt;/div&gt;

&lt;p&gt;The drop-off above as the Plato work gets longer is quite dramatic (especially when you consider this is a log-log plot). The points fit quite well to a line, though, indicating how Zipfian the distribution is. This demonstrates the clear relationship between the length of the work and how many lemmas you&#39;re likely familiar with. The longer a work is, the more distinct lemmas it will use, although they tend to be low frequency within the work (hence how horizontal the lines in the first chart are).&lt;/p&gt;
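&lt;p&gt;To make the contrast between the two charts concrete, here is a minimal sketch of a least-squares fit in log-log space. It is not the plotting code behind the charts. It uses only five rows from the table in the previous post (the 100% GNT level), and &lt;code&gt;loglog_fit&lt;/code&gt; is a name invented for the example.&lt;/p&gt;

```python
import math


def loglog_fit(points):
    """Least-squares line through (log10 x, log10 y); returns (slope, intercept)."""
    xs = [math.log10(x) for x, _ in points]
    ys = [math.log10(y) for _, y in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# (tokens, % GNT lemmas, % GNT tokens) for Cleitophon, Crito, Phaedo,
# Republic and Laws, copied from the previous post's table.
rows = [
    (1549, 67.94, 83.47),
    (4172, 60.81, 82.19),
    (21825, 52.06, 82.63),
    (88878, 36.77, 80.31),
    (103193, 34.51, 80.09),
]

lemma_slope, _ = loglog_fit([(tokens, lemma_pct) for tokens, lemma_pct, _ in rows])
token_slope, _ = loglog_fit([(tokens, token_pct) for tokens, _, token_pct in rows])

print(round(lemma_slope, 3), round(token_slope, 3))
```

&lt;p&gt;On this small sample the lemma-coverage slope is clearly negative while the token-coverage slope is close to zero, which matches the steep and nearly horizontal best-fit lines respectively.&lt;/p&gt;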
&lt;p&gt;Notice there &lt;em&gt;are&lt;/em&gt; some outliers—some works that seem to have higher coverage than their length would suggest given the best-fit line. I&#39;ve called out one here, showing just the GNT 80% points and best-fit line (although it&#39;s an outlier on the others too):&lt;/p&gt;
&lt;div align=&#34;center&#34;&gt;
&lt;img src=&#34;/images/gnt_plato_3.png&#34; width=&#34;80%&#34;&gt;
&lt;/div&gt;

&lt;p&gt;This suggests that this work might be, in some sense, &lt;em&gt;easier&lt;/em&gt; for a GNT reader to read compared with other works of Plato. It suggests that perhaps the vocabulary of that particular work is closer to that of the GNT. The data was all there in the previous post but it&#39;s a lot easier to spot the outliers graphically.&lt;/p&gt;
&lt;p&gt;The work indicated above is &lt;em&gt;Parmenides&lt;/em&gt;. I started to wonder what it was about that work that made it more &#34;GNT like&#34;.&lt;/p&gt;
&lt;p&gt;Then I took a step back because I realised there may be a confounding factor here. The statement &#34;this work might be &lt;em&gt;easier&lt;/em&gt; for a GNT reader to read compared with other works of Plato&#34; stands but note &lt;strong&gt;this might not be a property of any GNT/Parmenides shared vocabulary but rather just the word distribution in Parmenides itself&lt;/strong&gt;. In other words, Parmenides might just be &lt;em&gt;easier&lt;/em&gt; compared with other works of Plato and that might have nothing to do with any vocabulary similarity to the GNT.&lt;/p&gt;
&lt;p&gt;So I decided to just plot the token-to-lemma counts in the works of Plato. This doesn&#39;t involve the GNT at all, just how many tokens each work in Plato has versus how many unique lemmas that work has.&lt;/p&gt;
&lt;p&gt;Here is the result with &lt;em&gt;Parmenides&lt;/em&gt; called out:&lt;/p&gt;
&lt;div align=&#34;center&#34;&gt;
&lt;img src=&#34;/images/gnt_plato_4.png&#34; width=&#34;80%&#34;&gt;
&lt;/div&gt;

&lt;p&gt;In other words, a large part (and maybe all) of why &lt;em&gt;Parmenides&lt;/em&gt; stands off the line in the coverage after GNT is because it simply has fewer lemmas for its overall token count. Its vocabulary is just smaller for its length.&lt;/p&gt;
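&lt;p&gt;The same point can be checked numerically. This is a small sketch that uses lemma and token counts from the previous post&#39;s table for four works of roughly similar length. The choice of comparison works is mine, for illustration only.&lt;/p&gt;

```python
# (unique lemmas, total tokens) copied from the previous post's table.
works = {
    "Sophist": (1598, 16024),
    "Statesman": (2013, 16953),
    "Parmenides": (805, 15155),
    "Phaedrus": (2266, 16645),
}

# Average number of tokens per distinct lemma: higher means a smaller
# vocabulary for the length of the work.
tokens_per_lemma = {
    title: tokens / lemmas for title, (lemmas, tokens) in works.items()
}

most_repetitive = max(tokens_per_lemma, key=tokens_per_lemma.get)

for title in sorted(tokens_per_lemma, key=tokens_per_lemma.get, reverse=True):
    print(title, round(tokens_per_lemma[title], 2))
```

&lt;p&gt;&lt;em&gt;Parmenides&lt;/em&gt; comes out at roughly 19 tokens per lemma, against roughly 7 to 10 for the other three, without the GNT entering the calculation at all.&lt;/p&gt;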
&lt;p&gt;In fact, visually you can see that most of the deviations of works from the line in the earlier charts map to corresponding deviations in this chart (which, remember, has nothing to do with the GNT).&lt;/p&gt;
&lt;p&gt;This is just some visual comparison. There are more quantitative ways of actually measuring how much the deviations in the first three charts can be explained by those in the last chart. But I&#39;ll save that for another post.&lt;/p&gt;
&lt;p&gt;The important takeaway for now is that, to the extent some works of Plato might be easier to read after the GNT than others, this probably has little to do with any relationship between their vocabularies, and is more to do with the inherent token-to-lemma ratio of the target work of Plato. It is possible to separate out the effects of each, though, and I will explore that in the future.&lt;/p&gt;
&lt;p&gt;Note all the caveats I listed in my previous post about this data. Better lemmatization and richer vocabulary models are still needed.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">In the &lt;a href=&#34;https://jktauber.com/2020/09/02/plato-vocabulary-coverage-after-new-testament/&#34;&gt;previous post&lt;/a&gt;, we looked at lemma and token coverage in the works of Plato assuming knowledge of Greek New Testament vocabulary. Here we graphically look at those results and make an important observation.</summary>
  </entry><entry>
    <title type="html">Plato Vocabulary Coverage After the New Testament</title>
    <link href="https://jktauber.com/2020/09/02/plato-vocabulary-coverage-after-new-testament/" rel="alternate" type="text/html" title="Plato Vocabulary Coverage After the New Testament"/>
    <published>2020-09-02T19:40:00+08:00</published>
    <updated>2020-09-02T19:40:00+08:00</updated>
    <id>https://jktauber.com/2020/09/02/plato-vocabulary-coverage-after-new-testament</id>
    <content type="html" xml:base="https://jktauber.com/2020/09/02/plato-vocabulary-coverage-after-new-testament/">&lt;p&gt;&lt;a href=&#34;https://thepatrologist.com&#34;&gt;Seumas Macdonald&lt;/a&gt; asked me about vocabulary coverage for each work of Plato assuming one has learnt the New Testament vocabulary.&lt;/p&gt;
&lt;p&gt;It turned out to be very simple to do with &lt;code&gt;vocabulary-tools&lt;/code&gt; and you can now see the script in the repo as &lt;a href=&#34;https://github.com/jtauber/vocabulary-tools/blob/master/examples3.py&#34;&gt;examples3.py&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;But here let me share the results and give some caveats.&lt;/p&gt;
&lt;p&gt;In the table below:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;lemmas&lt;/strong&gt; is the number of unique lemmas in the work; e.g. Crito has 712 unique lemmas&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;tokens&lt;/strong&gt; is the number of total tokens in the work; e.g. Crito has 4,172 tokens&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GNT lemmas&lt;/strong&gt; is how many of those lemmas are in the GNT; e.g. 433 (of the 712) lemmas in Crito are also in the GNT&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GNT tokens&lt;/strong&gt; is how many of the total tokens in the work have lemmas in the GNT; e.g. 3,429 of the 4,172 tokens in Crito have lemmas in the GNT&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;% GNT lemmas&lt;/strong&gt; and &lt;strong&gt;% GNT tokens&lt;/strong&gt; just express those counts as percentages; e.g. 60.81% of the lemmas in Crito are in the GNT and 82.19% of tokens in Crito have lemmas seen in the GNT&lt;/li&gt;
&lt;/ul&gt;
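&lt;p&gt;The column definitions above can be sketched as follows. This is a minimal illustration, not the contents of &lt;code&gt;examples3.py&lt;/code&gt; and not the &lt;code&gt;vocabulary-tools&lt;/code&gt; API. It assumes a work is a list of lemmas (one per token) and the known vocabulary is a set of lemmas. The function name &lt;code&gt;coverage&lt;/code&gt; and the toy data are hypothetical.&lt;/p&gt;

```python
from collections import Counter


def coverage(work_lemmas, known_lemmas):
    """Compute the table columns for one work, given one lemma per token."""
    counts = Counter(work_lemmas)
    lemmas = len(counts)           # unique lemmas in the work
    tokens = sum(counts.values())  # total tokens in the work
    known = [lemma for lemma in counts if lemma in known_lemmas]
    known_tokens = sum(counts[lemma] for lemma in known)
    return {
        "lemmas": lemmas,
        "tokens": tokens,
        "GNT lemmas": len(known),
        "GNT tokens": known_tokens,
        "% GNT lemmas": round(100 * len(known) / lemmas, 2),
        "% GNT tokens": round(100 * known_tokens / tokens, 2),
    }


# Hypothetical toy work: four tokens, three distinct lemmas.
result = coverage(["x", "x", "y", "z"], {"x", "z"})
print(result)

# The percentages are plain ratios of the counts, e.g. for Crito:
crito_lemma_pct = round(100 * 433 / 712, 2)    # 60.81
crito_token_pct = round(100 * 3429 / 4172, 2)  # 82.19
```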
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;title&lt;/th&gt;
&lt;th&gt;lemmas&lt;/th&gt;
&lt;th&gt;tokens&lt;/th&gt;
&lt;th&gt;GNT lemmas&lt;/th&gt;
&lt;th&gt;GNT tokens&lt;/th&gt;
&lt;th&gt;% GNT lemmas&lt;/th&gt;
&lt;th&gt;% GNT tokens&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;001&lt;/td&gt;
&lt;td&gt;Euthyphro&lt;/td&gt;
&lt;td&gt;690&lt;/td&gt;
&lt;td&gt;5,181&lt;/td&gt;
&lt;td&gt;441&lt;/td&gt;
&lt;td&gt;4,274&lt;/td&gt;
&lt;td&gt;63.91%&lt;/td&gt;
&lt;td&gt;82.49%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;002&lt;/td&gt;
&lt;td&gt;Apology&lt;/td&gt;
&lt;td&gt;1,112&lt;/td&gt;
&lt;td&gt;8,745&lt;/td&gt;
&lt;td&gt;631&lt;/td&gt;
&lt;td&gt;7,357&lt;/td&gt;
&lt;td&gt;56.74%&lt;/td&gt;
&lt;td&gt;84.13%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;003&lt;/td&gt;
&lt;td&gt;Crito&lt;/td&gt;
&lt;td&gt;712&lt;/td&gt;
&lt;td&gt;4,172&lt;/td&gt;
&lt;td&gt;433&lt;/td&gt;
&lt;td&gt;3,429&lt;/td&gt;
&lt;td&gt;60.81%&lt;/td&gt;
&lt;td&gt;82.19%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;004&lt;/td&gt;
&lt;td&gt;Phaedo&lt;/td&gt;
&lt;td&gt;1,921&lt;/td&gt;
&lt;td&gt;21,825&lt;/td&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;18,033&lt;/td&gt;
&lt;td&gt;52.06%&lt;/td&gt;
&lt;td&gt;82.63%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;005&lt;/td&gt;
&lt;td&gt;Cratylus&lt;/td&gt;
&lt;td&gt;1,607&lt;/td&gt;
&lt;td&gt;17,944&lt;/td&gt;
&lt;td&gt;781&lt;/td&gt;
&lt;td&gt;14,701&lt;/td&gt;
&lt;td&gt;48.60%&lt;/td&gt;
&lt;td&gt;81.93%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;006&lt;/td&gt;
&lt;td&gt;Theaetetus&lt;/td&gt;
&lt;td&gt;2,072&lt;/td&gt;
&lt;td&gt;22,489&lt;/td&gt;
&lt;td&gt;966&lt;/td&gt;
&lt;td&gt;17,962&lt;/td&gt;
&lt;td&gt;46.62%&lt;/td&gt;
&lt;td&gt;79.87%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;007&lt;/td&gt;
&lt;td&gt;Sophist&lt;/td&gt;
&lt;td&gt;1,598&lt;/td&gt;
&lt;td&gt;16,024&lt;/td&gt;
&lt;td&gt;788&lt;/td&gt;
&lt;td&gt;12,932&lt;/td&gt;
&lt;td&gt;49.31%&lt;/td&gt;
&lt;td&gt;80.70%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;008&lt;/td&gt;
&lt;td&gt;Statesman&lt;/td&gt;
&lt;td&gt;2,013&lt;/td&gt;
&lt;td&gt;16,953&lt;/td&gt;
&lt;td&gt;937&lt;/td&gt;
&lt;td&gt;13,384&lt;/td&gt;
&lt;td&gt;46.55%&lt;/td&gt;
&lt;td&gt;78.95%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;009&lt;/td&gt;
&lt;td&gt;Parmenides&lt;/td&gt;
&lt;td&gt;805&lt;/td&gt;
&lt;td&gt;15,155&lt;/td&gt;
&lt;td&gt;478&lt;/td&gt;
&lt;td&gt;12,738&lt;/td&gt;
&lt;td&gt;59.38%&lt;/td&gt;
&lt;td&gt;84.05%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;010&lt;/td&gt;
&lt;td&gt;Philebus&lt;/td&gt;
&lt;td&gt;1,567&lt;/td&gt;
&lt;td&gt;17,668&lt;/td&gt;
&lt;td&gt;800&lt;/td&gt;
&lt;td&gt;14,076&lt;/td&gt;
&lt;td&gt;51.05%&lt;/td&gt;
&lt;td&gt;79.67%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;011&lt;/td&gt;
&lt;td&gt;Symposium&lt;/td&gt;
&lt;td&gt;1,949&lt;/td&gt;
&lt;td&gt;17,461&lt;/td&gt;
&lt;td&gt;961&lt;/td&gt;
&lt;td&gt;13,806&lt;/td&gt;
&lt;td&gt;49.31%&lt;/td&gt;
&lt;td&gt;79.07%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;012&lt;/td&gt;
&lt;td&gt;Phaedrus&lt;/td&gt;
&lt;td&gt;2,266&lt;/td&gt;
&lt;td&gt;16,645&lt;/td&gt;
&lt;td&gt;1,027&lt;/td&gt;
&lt;td&gt;12,935&lt;/td&gt;
&lt;td&gt;45.32%&lt;/td&gt;
&lt;td&gt;77.71%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;013&lt;/td&gt;
&lt;td&gt;Alcibiades 1&lt;/td&gt;
&lt;td&gt;1,138&lt;/td&gt;
&lt;td&gt;10,264&lt;/td&gt;
&lt;td&gt;628&lt;/td&gt;
&lt;td&gt;8,356&lt;/td&gt;
&lt;td&gt;55.18%&lt;/td&gt;
&lt;td&gt;81.41%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;014&lt;/td&gt;
&lt;td&gt;Alcibiades 2&lt;/td&gt;
&lt;td&gt;711&lt;/td&gt;
&lt;td&gt;4,268&lt;/td&gt;
&lt;td&gt;420&lt;/td&gt;
&lt;td&gt;3,449&lt;/td&gt;
&lt;td&gt;59.07%&lt;/td&gt;
&lt;td&gt;80.81%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;015&lt;/td&gt;
&lt;td&gt;Hipparchus&lt;/td&gt;
&lt;td&gt;431&lt;/td&gt;
&lt;td&gt;2,256&lt;/td&gt;
&lt;td&gt;281&lt;/td&gt;
&lt;td&gt;1,890&lt;/td&gt;
&lt;td&gt;65.20%&lt;/td&gt;
&lt;td&gt;83.78%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;016&lt;/td&gt;
&lt;td&gt;Lovers&lt;/td&gt;
&lt;td&gt;473&lt;/td&gt;
&lt;td&gt;2,391&lt;/td&gt;
&lt;td&gt;284&lt;/td&gt;
&lt;td&gt;1,923&lt;/td&gt;
&lt;td&gt;60.04%&lt;/td&gt;
&lt;td&gt;80.43%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;017&lt;/td&gt;
&lt;td&gt;Theages&lt;/td&gt;
&lt;td&gt;627&lt;/td&gt;
&lt;td&gt;3,485&lt;/td&gt;
&lt;td&gt;374&lt;/td&gt;
&lt;td&gt;2,811&lt;/td&gt;
&lt;td&gt;59.65%&lt;/td&gt;
&lt;td&gt;80.66%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;018&lt;/td&gt;
&lt;td&gt;Charmides&lt;/td&gt;
&lt;td&gt;919&lt;/td&gt;
&lt;td&gt;8,311&lt;/td&gt;
&lt;td&gt;534&lt;/td&gt;
&lt;td&gt;6,875&lt;/td&gt;
&lt;td&gt;58.11%&lt;/td&gt;
&lt;td&gt;82.72%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;019&lt;/td&gt;
&lt;td&gt;Laches&lt;/td&gt;
&lt;td&gt;960&lt;/td&gt;
&lt;td&gt;7,674&lt;/td&gt;
&lt;td&gt;559&lt;/td&gt;
&lt;td&gt;6,100&lt;/td&gt;
&lt;td&gt;58.23%&lt;/td&gt;
&lt;td&gt;79.49%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;020&lt;/td&gt;
&lt;td&gt;Lysis&lt;/td&gt;
&lt;td&gt;911&lt;/td&gt;
&lt;td&gt;6,980&lt;/td&gt;
&lt;td&gt;524&lt;/td&gt;
&lt;td&gt;5,729&lt;/td&gt;
&lt;td&gt;57.52%&lt;/td&gt;
&lt;td&gt;82.08%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;021&lt;/td&gt;
&lt;td&gt;Euthydemus&lt;/td&gt;
&lt;td&gt;1,268&lt;/td&gt;
&lt;td&gt;12,453&lt;/td&gt;
&lt;td&gt;686&lt;/td&gt;
&lt;td&gt;10,015&lt;/td&gt;
&lt;td&gt;54.10%&lt;/td&gt;
&lt;td&gt;80.42%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;022&lt;/td&gt;
&lt;td&gt;Protagoras&lt;/td&gt;
&lt;td&gt;1,753&lt;/td&gt;
&lt;td&gt;17,795&lt;/td&gt;
&lt;td&gt;869&lt;/td&gt;
&lt;td&gt;14,306&lt;/td&gt;
&lt;td&gt;49.57%&lt;/td&gt;
&lt;td&gt;80.39%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;023&lt;/td&gt;
&lt;td&gt;Gorgias&lt;/td&gt;
&lt;td&gt;1,938&lt;/td&gt;
&lt;td&gt;26,337&lt;/td&gt;
&lt;td&gt;951&lt;/td&gt;
&lt;td&gt;21,467&lt;/td&gt;
&lt;td&gt;49.07%&lt;/td&gt;
&lt;td&gt;81.51%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;024&lt;/td&gt;
&lt;td&gt;Meno&lt;/td&gt;
&lt;td&gt;961&lt;/td&gt;
&lt;td&gt;9,791&lt;/td&gt;
&lt;td&gt;534&lt;/td&gt;
&lt;td&gt;8,066&lt;/td&gt;
&lt;td&gt;55.57%&lt;/td&gt;
&lt;td&gt;82.38%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;025&lt;/td&gt;
&lt;td&gt;Hippias Major&lt;/td&gt;
&lt;td&gt;958&lt;/td&gt;
&lt;td&gt;8,448&lt;/td&gt;
&lt;td&gt;528&lt;/td&gt;
&lt;td&gt;6,730&lt;/td&gt;
&lt;td&gt;55.11%&lt;/td&gt;
&lt;td&gt;79.66%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;026&lt;/td&gt;
&lt;td&gt;Hippias Minor&lt;/td&gt;
&lt;td&gt;698&lt;/td&gt;
&lt;td&gt;4,360&lt;/td&gt;
&lt;td&gt;396&lt;/td&gt;
&lt;td&gt;3,387&lt;/td&gt;
&lt;td&gt;56.73%&lt;/td&gt;
&lt;td&gt;77.68%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;027&lt;/td&gt;
&lt;td&gt;Ion&lt;/td&gt;
&lt;td&gt;721&lt;/td&gt;
&lt;td&gt;4,024&lt;/td&gt;
&lt;td&gt;382&lt;/td&gt;
&lt;td&gt;3,012&lt;/td&gt;
&lt;td&gt;52.98%&lt;/td&gt;
&lt;td&gt;74.85%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;028&lt;/td&gt;
&lt;td&gt;Menexenus&lt;/td&gt;
&lt;td&gt;958&lt;/td&gt;
&lt;td&gt;4,808&lt;/td&gt;
&lt;td&gt;571&lt;/td&gt;
&lt;td&gt;3,985&lt;/td&gt;
&lt;td&gt;59.60%&lt;/td&gt;
&lt;td&gt;82.88%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;029&lt;/td&gt;
&lt;td&gt;Cleitophon&lt;/td&gt;
&lt;td&gt;418&lt;/td&gt;
&lt;td&gt;1,549&lt;/td&gt;
&lt;td&gt;284&lt;/td&gt;
&lt;td&gt;1,293&lt;/td&gt;
&lt;td&gt;67.94%&lt;/td&gt;
&lt;td&gt;83.47%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;030&lt;/td&gt;
&lt;td&gt;Republic&lt;/td&gt;
&lt;td&gt;4,846&lt;/td&gt;
&lt;td&gt;88,878&lt;/td&gt;
&lt;td&gt;1,782&lt;/td&gt;
&lt;td&gt;71,377&lt;/td&gt;
&lt;td&gt;36.77%&lt;/td&gt;
&lt;td&gt;80.31%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;031&lt;/td&gt;
&lt;td&gt;Timaeus&lt;/td&gt;
&lt;td&gt;2,666&lt;/td&gt;
&lt;td&gt;23,662&lt;/td&gt;
&lt;td&gt;1,122&lt;/td&gt;
&lt;td&gt;18,644&lt;/td&gt;
&lt;td&gt;42.09%&lt;/td&gt;
&lt;td&gt;78.79%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;032&lt;/td&gt;
&lt;td&gt;Critias&lt;/td&gt;
&lt;td&gt;1,130&lt;/td&gt;
&lt;td&gt;4,950&lt;/td&gt;
&lt;td&gt;638&lt;/td&gt;
&lt;td&gt;3,997&lt;/td&gt;
&lt;td&gt;56.46%&lt;/td&gt;
&lt;td&gt;80.75%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;033&lt;/td&gt;
&lt;td&gt;Minos&lt;/td&gt;
&lt;td&gt;528&lt;/td&gt;
&lt;td&gt;2,859&lt;/td&gt;
&lt;td&gt;309&lt;/td&gt;
&lt;td&gt;2,333&lt;/td&gt;
&lt;td&gt;58.52%&lt;/td&gt;
&lt;td&gt;81.60%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;034&lt;/td&gt;
&lt;td&gt;Laws&lt;/td&gt;
&lt;td&gt;5,227&lt;/td&gt;
&lt;td&gt;103,193&lt;/td&gt;
&lt;td&gt;1,804&lt;/td&gt;
&lt;td&gt;82,652&lt;/td&gt;
&lt;td&gt;34.51%&lt;/td&gt;
&lt;td&gt;80.09%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;035&lt;/td&gt;
&lt;td&gt;Epinomis&lt;/td&gt;
&lt;td&gt;1,014&lt;/td&gt;
&lt;td&gt;6,309&lt;/td&gt;
&lt;td&gt;590&lt;/td&gt;
&lt;td&gt;5,135&lt;/td&gt;
&lt;td&gt;58.19%&lt;/td&gt;
&lt;td&gt;81.39%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;036&lt;/td&gt;
&lt;td&gt;Epistles&lt;/td&gt;
&lt;td&gt;2,026&lt;/td&gt;
&lt;td&gt;16,964&lt;/td&gt;
&lt;td&gt;1,015&lt;/td&gt;
&lt;td&gt;13,768&lt;/td&gt;
&lt;td&gt;50.10%&lt;/td&gt;
&lt;td&gt;81.16%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;It&#39;s encouraging how many works are above the 80% level. Here are some caveats, though:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;the Plato lemmatization is from Diorisis so has not been checked and may have errors throwing things off&lt;/li&gt;
&lt;li&gt;the GNT lemmatization is MorphGNT and so even if Diorisis got it “right” it may have a different lemmatization scheme than MorphGNT&lt;/li&gt;
&lt;li&gt;this assumes 100% knowledge of GNT lemmas&lt;/li&gt;
&lt;li&gt;this doesn’t take into account word families nor individual forms&lt;/li&gt;
&lt;li&gt;this coverage calculation theoretically favours shorter works. You can see that in how much lower the % GNT lemmas is for longer works like the &lt;em&gt;Laws&lt;/em&gt; and &lt;em&gt;Republic&lt;/em&gt; although (perhaps significantly) this doesn’t actually seem to skew token coverage&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Favouring shorter works isn&#39;t necessarily a bad thing if the goal is to find the most readable (by vocabulary) works of Plato post-GNT.&lt;/p&gt;
&lt;p&gt;Here&#39;s a run of the code only assuming the 80% level of GNT vocabulary rather than the whole thing.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;title&lt;/th&gt;
&lt;th&gt;lemmas&lt;/th&gt;
&lt;th&gt;tokens&lt;/th&gt;
&lt;th&gt;GNT lemmas&lt;/th&gt;
&lt;th&gt;GNT tokens&lt;/th&gt;
&lt;th&gt;% GNT lemmas&lt;/th&gt;
&lt;th&gt;% GNT tokens&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;001&lt;/td&gt;
&lt;td&gt;Euthyphro&lt;/td&gt;
&lt;td&gt;690&lt;/td&gt;
&lt;td&gt;5,181&lt;/td&gt;
&lt;td&gt;149&lt;/td&gt;
&lt;td&gt;3,135&lt;/td&gt;
&lt;td&gt;21.59%&lt;/td&gt;
&lt;td&gt;60.51%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;002&lt;/td&gt;
&lt;td&gt;Apology&lt;/td&gt;
&lt;td&gt;1,112&lt;/td&gt;
&lt;td&gt;8,745&lt;/td&gt;
&lt;td&gt;165&lt;/td&gt;
&lt;td&gt;5,551&lt;/td&gt;
&lt;td&gt;14.84%&lt;/td&gt;
&lt;td&gt;63.48%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;003&lt;/td&gt;
&lt;td&gt;Crito&lt;/td&gt;
&lt;td&gt;712&lt;/td&gt;
&lt;td&gt;4,172&lt;/td&gt;
&lt;td&gt;150&lt;/td&gt;
&lt;td&gt;2,581&lt;/td&gt;
&lt;td&gt;21.07%&lt;/td&gt;
&lt;td&gt;61.86%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;004&lt;/td&gt;
&lt;td&gt;Phaedo&lt;/td&gt;
&lt;td&gt;1,921&lt;/td&gt;
&lt;td&gt;21,825&lt;/td&gt;
&lt;td&gt;214&lt;/td&gt;
&lt;td&gt;13,647&lt;/td&gt;
&lt;td&gt;11.14%&lt;/td&gt;
&lt;td&gt;62.53%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;005&lt;/td&gt;
&lt;td&gt;Cratylus&lt;/td&gt;
&lt;td&gt;1,607&lt;/td&gt;
&lt;td&gt;17,944&lt;/td&gt;
&lt;td&gt;192&lt;/td&gt;
&lt;td&gt;11,208&lt;/td&gt;
&lt;td&gt;11.95%&lt;/td&gt;
&lt;td&gt;62.46%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;006&lt;/td&gt;
&lt;td&gt;Theaetetus&lt;/td&gt;
&lt;td&gt;2,072&lt;/td&gt;
&lt;td&gt;22,489&lt;/td&gt;
&lt;td&gt;215&lt;/td&gt;
&lt;td&gt;13,416&lt;/td&gt;
&lt;td&gt;10.38%&lt;/td&gt;
&lt;td&gt;59.66%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;007&lt;/td&gt;
&lt;td&gt;Sophist&lt;/td&gt;
&lt;td&gt;1,598&lt;/td&gt;
&lt;td&gt;16,024&lt;/td&gt;
&lt;td&gt;183&lt;/td&gt;
&lt;td&gt;9,644&lt;/td&gt;
&lt;td&gt;11.45%&lt;/td&gt;
&lt;td&gt;60.18%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;008&lt;/td&gt;
&lt;td&gt;Statesman&lt;/td&gt;
&lt;td&gt;2,013&lt;/td&gt;
&lt;td&gt;16,953&lt;/td&gt;
&lt;td&gt;194&lt;/td&gt;
&lt;td&gt;9,577&lt;/td&gt;
&lt;td&gt;9.64%&lt;/td&gt;
&lt;td&gt;56.49%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;009&lt;/td&gt;
&lt;td&gt;Parmenides&lt;/td&gt;
&lt;td&gt;805&lt;/td&gt;
&lt;td&gt;15,155&lt;/td&gt;
&lt;td&gt;140&lt;/td&gt;
&lt;td&gt;9,852&lt;/td&gt;
&lt;td&gt;17.39%&lt;/td&gt;
&lt;td&gt;65.01%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;010&lt;/td&gt;
&lt;td&gt;Philebus&lt;/td&gt;
&lt;td&gt;1,567&lt;/td&gt;
&lt;td&gt;17,668&lt;/td&gt;
&lt;td&gt;187&lt;/td&gt;
&lt;td&gt;10,209&lt;/td&gt;
&lt;td&gt;11.93%&lt;/td&gt;
&lt;td&gt;57.78%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;011&lt;/td&gt;
&lt;td&gt;Symposium&lt;/td&gt;
&lt;td&gt;1,949&lt;/td&gt;
&lt;td&gt;17,461&lt;/td&gt;
&lt;td&gt;208&lt;/td&gt;
&lt;td&gt;10,437&lt;/td&gt;
&lt;td&gt;10.67%&lt;/td&gt;
&lt;td&gt;59.77%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;012&lt;/td&gt;
&lt;td&gt;Phaedrus&lt;/td&gt;
&lt;td&gt;2,266&lt;/td&gt;
&lt;td&gt;16,645&lt;/td&gt;
&lt;td&gt;212&lt;/td&gt;
&lt;td&gt;9,395&lt;/td&gt;
&lt;td&gt;9.36%&lt;/td&gt;
&lt;td&gt;56.44%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;013&lt;/td&gt;
&lt;td&gt;Alcibiades 1&lt;/td&gt;
&lt;td&gt;1,138&lt;/td&gt;
&lt;td&gt;10,264&lt;/td&gt;
&lt;td&gt;177&lt;/td&gt;
&lt;td&gt;6,296&lt;/td&gt;
&lt;td&gt;15.55%&lt;/td&gt;
&lt;td&gt;61.34%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;014&lt;/td&gt;
&lt;td&gt;Alcibiades 2&lt;/td&gt;
&lt;td&gt;711&lt;/td&gt;
&lt;td&gt;4,268&lt;/td&gt;
&lt;td&gt;142&lt;/td&gt;
&lt;td&gt;2,566&lt;/td&gt;
&lt;td&gt;19.97%&lt;/td&gt;
&lt;td&gt;60.12%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;015&lt;/td&gt;
&lt;td&gt;Hipparchus&lt;/td&gt;
&lt;td&gt;431&lt;/td&gt;
&lt;td&gt;2,256&lt;/td&gt;
&lt;td&gt;111&lt;/td&gt;
&lt;td&gt;1,339&lt;/td&gt;
&lt;td&gt;25.75%&lt;/td&gt;
&lt;td&gt;59.35%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;016&lt;/td&gt;
&lt;td&gt;Lovers&lt;/td&gt;
&lt;td&gt;473&lt;/td&gt;
&lt;td&gt;2,391&lt;/td&gt;
&lt;td&gt;104&lt;/td&gt;
&lt;td&gt;1,427&lt;/td&gt;
&lt;td&gt;21.99%&lt;/td&gt;
&lt;td&gt;59.68%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;017&lt;/td&gt;
&lt;td&gt;Theages&lt;/td&gt;
&lt;td&gt;627&lt;/td&gt;
&lt;td&gt;3,485&lt;/td&gt;
&lt;td&gt;124&lt;/td&gt;
&lt;td&gt;2,129&lt;/td&gt;
&lt;td&gt;19.78%&lt;/td&gt;
&lt;td&gt;61.09%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;018&lt;/td&gt;
&lt;td&gt;Charmides&lt;/td&gt;
&lt;td&gt;919&lt;/td&gt;
&lt;td&gt;8,311&lt;/td&gt;
&lt;td&gt;158&lt;/td&gt;
&lt;td&gt;5,277&lt;/td&gt;
&lt;td&gt;17.19%&lt;/td&gt;
&lt;td&gt;63.49%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;019&lt;/td&gt;
&lt;td&gt;Laches&lt;/td&gt;
&lt;td&gt;960&lt;/td&gt;
&lt;td&gt;7,674&lt;/td&gt;
&lt;td&gt;165&lt;/td&gt;
&lt;td&gt;4,632&lt;/td&gt;
&lt;td&gt;17.19%&lt;/td&gt;
&lt;td&gt;60.36%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;020&lt;/td&gt;
&lt;td&gt;Lysis&lt;/td&gt;
&lt;td&gt;911&lt;/td&gt;
&lt;td&gt;6,980&lt;/td&gt;
&lt;td&gt;150&lt;/td&gt;
&lt;td&gt;4,204&lt;/td&gt;
&lt;td&gt;16.47%&lt;/td&gt;
&lt;td&gt;60.23%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;021&lt;/td&gt;
&lt;td&gt;Euthydemus&lt;/td&gt;
&lt;td&gt;1,268&lt;/td&gt;
&lt;td&gt;12,453&lt;/td&gt;
&lt;td&gt;181&lt;/td&gt;
&lt;td&gt;7,640&lt;/td&gt;
&lt;td&gt;14.27%&lt;/td&gt;
&lt;td&gt;61.35%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;022&lt;/td&gt;
&lt;td&gt;Protagoras&lt;/td&gt;
&lt;td&gt;1,753&lt;/td&gt;
&lt;td&gt;17,795&lt;/td&gt;
&lt;td&gt;195&lt;/td&gt;
&lt;td&gt;10,973&lt;/td&gt;
&lt;td&gt;11.12%&lt;/td&gt;
&lt;td&gt;61.66%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;023&lt;/td&gt;
&lt;td&gt;Gorgias&lt;/td&gt;
&lt;td&gt;1,938&lt;/td&gt;
&lt;td&gt;26,337&lt;/td&gt;
&lt;td&gt;205&lt;/td&gt;
&lt;td&gt;16,301&lt;/td&gt;
&lt;td&gt;10.58%&lt;/td&gt;
&lt;td&gt;61.89%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;024&lt;/td&gt;
&lt;td&gt;Meno&lt;/td&gt;
&lt;td&gt;961&lt;/td&gt;
&lt;td&gt;9,791&lt;/td&gt;
&lt;td&gt;159&lt;/td&gt;
&lt;td&gt;6,042&lt;/td&gt;
&lt;td&gt;16.55%&lt;/td&gt;
&lt;td&gt;61.71%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;025&lt;/td&gt;
&lt;td&gt;Hippias Major&lt;/td&gt;
&lt;td&gt;958&lt;/td&gt;
&lt;td&gt;8,448&lt;/td&gt;
&lt;td&gt;154&lt;/td&gt;
&lt;td&gt;5,123&lt;/td&gt;
&lt;td&gt;16.08%&lt;/td&gt;
&lt;td&gt;60.64%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;026&lt;/td&gt;
&lt;td&gt;Hippias Minor&lt;/td&gt;
&lt;td&gt;698&lt;/td&gt;
&lt;td&gt;4,360&lt;/td&gt;
&lt;td&gt;134&lt;/td&gt;
&lt;td&gt;2,446&lt;/td&gt;
&lt;td&gt;19.20%&lt;/td&gt;
&lt;td&gt;56.10%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;027&lt;/td&gt;
&lt;td&gt;Ion&lt;/td&gt;
&lt;td&gt;721&lt;/td&gt;
&lt;td&gt;4,024&lt;/td&gt;
&lt;td&gt;133&lt;/td&gt;
&lt;td&gt;2,236&lt;/td&gt;
&lt;td&gt;18.45%&lt;/td&gt;
&lt;td&gt;55.57%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;028&lt;/td&gt;
&lt;td&gt;Menexenus&lt;/td&gt;
&lt;td&gt;958&lt;/td&gt;
&lt;td&gt;4,808&lt;/td&gt;
&lt;td&gt;161&lt;/td&gt;
&lt;td&gt;2,877&lt;/td&gt;
&lt;td&gt;16.81%&lt;/td&gt;
&lt;td&gt;59.84%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;029&lt;/td&gt;
&lt;td&gt;Cleitophon&lt;/td&gt;
&lt;td&gt;418&lt;/td&gt;
&lt;td&gt;1,549&lt;/td&gt;
&lt;td&gt;113&lt;/td&gt;
&lt;td&gt;966&lt;/td&gt;
&lt;td&gt;27.03%&lt;/td&gt;
&lt;td&gt;62.36%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;030&lt;/td&gt;
&lt;td&gt;Republic&lt;/td&gt;
&lt;td&gt;4,846&lt;/td&gt;
&lt;td&gt;88,878&lt;/td&gt;
&lt;td&gt;252&lt;/td&gt;
&lt;td&gt;53,090&lt;/td&gt;
&lt;td&gt;5.20%&lt;/td&gt;
&lt;td&gt;59.73%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;031&lt;/td&gt;
&lt;td&gt;Timaeus&lt;/td&gt;
&lt;td&gt;2,666&lt;/td&gt;
&lt;td&gt;23,662&lt;/td&gt;
&lt;td&gt;210&lt;/td&gt;
&lt;td&gt;13,555&lt;/td&gt;
&lt;td&gt;7.88%&lt;/td&gt;
&lt;td&gt;57.29%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;032&lt;/td&gt;
&lt;td&gt;Critias&lt;/td&gt;
&lt;td&gt;1,130&lt;/td&gt;
&lt;td&gt;4,950&lt;/td&gt;
&lt;td&gt;171&lt;/td&gt;
&lt;td&gt;2,872&lt;/td&gt;
&lt;td&gt;15.13%&lt;/td&gt;
&lt;td&gt;58.02%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;033&lt;/td&gt;
&lt;td&gt;Minos&lt;/td&gt;
&lt;td&gt;528&lt;/td&gt;
&lt;td&gt;2,859&lt;/td&gt;
&lt;td&gt;121&lt;/td&gt;
&lt;td&gt;1,776&lt;/td&gt;
&lt;td&gt;22.92%&lt;/td&gt;
&lt;td&gt;62.12%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;034&lt;/td&gt;
&lt;td&gt;Laws&lt;/td&gt;
&lt;td&gt;5,227&lt;/td&gt;
&lt;td&gt;103,193&lt;/td&gt;
&lt;td&gt;250&lt;/td&gt;
&lt;td&gt;58,891&lt;/td&gt;
&lt;td&gt;4.78%&lt;/td&gt;
&lt;td&gt;57.07%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;035&lt;/td&gt;
&lt;td&gt;Epinomis&lt;/td&gt;
&lt;td&gt;1,014&lt;/td&gt;
&lt;td&gt;6,309&lt;/td&gt;
&lt;td&gt;165&lt;/td&gt;
&lt;td&gt;3,700&lt;/td&gt;
&lt;td&gt;16.27%&lt;/td&gt;
&lt;td&gt;58.65%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;036&lt;/td&gt;
&lt;td&gt;Epistles&lt;/td&gt;
&lt;td&gt;2,026&lt;/td&gt;
&lt;td&gt;16,964&lt;/td&gt;
&lt;td&gt;211&lt;/td&gt;
&lt;td&gt;10,229&lt;/td&gt;
&lt;td&gt;10.41%&lt;/td&gt;
&lt;td&gt;60.30%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The Plato coverage generally drops from around 80% to 60% which suggests it might be worth &#34;topping up&#34; one&#39;s vocabulary with some common Plato words not in the GNT before embarking on a specific work. It would be easy to generate such a list with &lt;code&gt;vocabulary-tools&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;But it was quite striking to me in both tables just how little the token % drops due to length (in contrast to the lemma %).&lt;/p&gt;
&lt;p&gt;This just goes to show that longer works introduce a lot of new words but very sparsely (probably with only one occurrence in many cases).&lt;/p&gt;
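&lt;p&gt;To make the difference between the two percentages concrete, here is a minimal sketch. It is not the &lt;code&gt;vocabulary-tools&lt;/code&gt; API: the function name &lt;code&gt;coverage&lt;/code&gt; and the toy data are made up for illustration. Lemma coverage counts each distinct lemma once, whereas token coverage weights each lemma by how often it occurs.&lt;/p&gt;

```python
from collections import Counter


def coverage(work_lemmas, known_lemmas):
    """Return (lemma %, token %) of a work covered by a known vocabulary.

    work_lemmas: one lemma per token of the work, in any order.
    known_lemmas: the set of lemmas the reader already knows.
    """
    counts = Counter(work_lemmas)
    known = [lemma for lemma in counts if lemma in known_lemmas]
    lemma_pct = 100 * len(known) / len(counts)
    token_pct = 100 * sum(counts[lemma] for lemma in known) / sum(counts.values())
    return lemma_pct, token_pct


# Toy data (hypothetical): two frequent known lemmas, three rare unknown ones.
toy_work = ["καί"] * 5 + ["λέγω"] * 2 + ["ὅσιος", "εὐσεβής", "θεοφιλής"]
toy_known = {"καί", "λέγω"}

lemma_pct, token_pct = coverage(toy_work, toy_known)
print(f"lemma coverage {lemma_pct:.2f}%, token coverage {token_pct:.2f}%")
```

&lt;p&gt;In the toy data, the three rare lemmas pull lemma coverage down to 40% while token coverage stays at 70%. The Apology row shows the same shape: 165 of 1,112 lemmas is 14.84% but 5,551 of 8,745 tokens is 63.48%.&lt;/p&gt;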
&lt;p&gt;I might explore that graphically in a follow-up post.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">&lt;a href=&#34;https://thepatrologist.com&#34;&gt;Seumas Macdonald&lt;/a&gt; asked me about vocabulary coverage for each work of Plato assuming one has learnt the New Testament vocabulary.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 48</title>
    <link href="https://jktauber.com/2020/08/30/a-tour-of-greek-morphology-part-48/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 48"/>
    <published>2020-08-30T21:12:00+08:00</published>
    <updated>2020-08-30T21:12:00+08:00</updated>
    <id>https://jktauber.com/2020/08/30/a-tour-of-greek-morphology-part-48</id>
    <content type="html" xml:base="https://jktauber.com/2020/08/30/a-tour-of-greek-morphology-part-48/">&lt;p&gt;Part forty-eight of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;We previously introduced the (θ)η-aorists. In this post, we&#39;ll mention the stem variants and then go over some counts.&lt;/p&gt;
&lt;p&gt;In terms of stem variants, we first of all have δέω, where we find the infinitive δεθῆναι alongside the &lt;strong&gt;1SG&lt;/strong&gt; ἐδεήθην and &lt;strong&gt;3SG&lt;/strong&gt; ἐδεήθη. The infinitive form suggests a stem of δε-θη whereas the finite forms suggest a stem of ἐ-δεη-θη with an extra η.&lt;/p&gt;
&lt;p&gt;Secondly, we have two &lt;strong&gt;3SG&lt;/strong&gt; forms of ἁρπάζω: ἡρπάγη and ἡρπάσθη.&lt;/p&gt;
&lt;p&gt;Finally we have ἀνοίγω with its confused augmentation (which we&#39;ve seen in other aorists) and also both a θ and non-θ form. Putting aside the ἠνοι- vs ἀνεῳ- vs ἠνεῳ- variation, we have &lt;strong&gt;3SG&lt;/strong&gt; ἠνοίχθη alongside ἠνοίγη and &lt;strong&gt;3PL&lt;/strong&gt; ἠνοίχθησαν alongside ἠνοίγησαν.&lt;/p&gt;
&lt;p&gt;Notice that in both the ἁρπάζω and ἀνοίγω cases, we have a non-θ form with γ before the η. We&#39;ll look at the letters we find before η and θη later in this post.&lt;/p&gt;
&lt;p&gt;But first let&#39;s do our usual counts of tokens and lemmas.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;class&lt;/th&gt;
&lt;th&gt;# lemmas&lt;/th&gt;
&lt;th&gt;# tokens&lt;/th&gt;
&lt;th&gt;# hapakes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;-θη-&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;250&lt;/td&gt;
&lt;td&gt;954&lt;/td&gt;
&lt;td&gt;130&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;-η-&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;34&lt;/td&gt;
&lt;td&gt;79&lt;/td&gt;
&lt;td&gt;19&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;As one can see, the non-θ forms are more rare lexically and the lexemes that do take them occur less frequently. They both, however, seem productive.&lt;/p&gt;
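&lt;p&gt;For readers who want to reproduce counts like these on their own data, here is a minimal sketch. It assumes one (class, lemma) pair per token and treats a &#34;hapax&#34; as a lemma with exactly one token in its class. The function name and the sample are hypothetical, not the real SBLGNT data.&lt;/p&gt;

```python
from collections import Counter, defaultdict


def class_counts(tokens):
    """Map each class to (# lemmas, # tokens, # hapakes).

    tokens: iterable of (class_label, lemma) pairs, one pair per token.
    A hapax is counted per class: a lemma with exactly one token there.
    """
    per_class = defaultdict(Counter)
    for class_label, lemma in tokens:
        per_class[class_label][lemma] += 1
    return {
        label: (
            len(counts),
            sum(counts.values()),
            sum(1 for n in counts.values() if n == 1),
        )
        for label, counts in per_class.items()
    }


# Hypothetical sample: a lemma may appear in both classes (like ἁρπάζω).
sample = [
    ("θη", "ἀποκρίνομαι"), ("θη", "ἀποκρίνομαι"), ("θη", "δίδωμι"),
    ("θη", "ἁρπάζω"),
    ("η", "χαίρω"), ("η", "χαίρω"), ("η", "ἁρπάζω"),
]
result = class_counts(sample)
for label, (lemmas, token_total, hapakes) in result.items():
    print(label, lemmas, token_total, hapakes)
```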
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;-θη-&lt;/th&gt;
&lt;th&gt;-η-&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;166&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;29&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;489&lt;/td&gt;
&lt;td&gt;43&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;44&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;188&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The distribution above is what we might expect except for the &lt;strong&gt;INF&lt;/strong&gt; which are disproportionately -θη-. This is not due to a single lexical item (unlike the &lt;strong&gt;3SG&lt;/strong&gt; where ἀπεκρίθη dominates).&lt;/p&gt;
&lt;p&gt;This will be worth further investigation but we have other things to cover first. For example, is there any phonological reason why a non-θ form might be used rather than a θ-form? We saw &lt;a href=&#34;https://jktauber.com/2020/03/17/a-tour-of-greek-morphology-part-41/&#34;&gt;previously&lt;/a&gt;, for example, that the existence or absence of the sigma in the alphathematic aorists was largely (although not entirely) predicted by the preceding letter.&lt;/p&gt;
&lt;p&gt;It turns out, at least in our text (we&#39;ll look more broadly later) there&#39;s quite a strong correlation between whether a θ is found or not and what the preceding letter is.&lt;/p&gt;
&lt;p&gt;For example, if the preceding letter is any of the vowels ε η ι ο υ ω, then we always find the θη form in the SBLGNT. α is the only exception and even then only in one lexical item out of 14, the κατακαίω form κατεκάη. (Notably κατεκαύ(σ)θη is more common elsewhere but we&#39;ll have to wait a little to discuss καίω forms in general.)&lt;/p&gt;
&lt;p&gt;If the preceding letter is σ, then we &lt;em&gt;always&lt;/em&gt; find the θη form. This is actually the most likely letter to precede θ by far, followed by η.&lt;/p&gt;
&lt;p&gt;ξ ψ and ζ don&#39;t appear in (θ)η aorists in the SBLGNT. Nor do δ τ or θ.&lt;/p&gt;
&lt;p&gt;Amongst the velars: κ doesn&#39;t appear in (θ)η aorists in the SBLGNT but γ and χ both do. γ is always followed directly by η (and in fact the bigram γθ never appears in the SBLGNT at all). In contrast, χ always takes the θ form (which &lt;em&gt;might&lt;/em&gt; be explained by an underlying κ or γ becoming χ because of the following θ but this doesn&#39;t explain why the θ would be absent in the -γ-η- instances).&lt;/p&gt;
&lt;p&gt;Amongst the bilabials: both π and β are always followed directly by η (and neither πθ nor βθ appear as bigrams in the SBLGNT). φ however is found both in θη and η forms with a slight preference for φθη over φη.&lt;/p&gt;
&lt;p&gt;This leaves our resonants: the liquids λ and ρ, and the nasals μ and ν. The bigram λθ is definitely allowed in Greek but we only find -λ-η- aorists, not -λ-θη-. With ν and ρ we find both θ and non-θ forms. There are no μ examples in the SBLGNT, nor do we find the bigram μθ.&lt;/p&gt;
&lt;p&gt;Here&#39;s a summary with lexeme counts:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;-θη-&lt;/th&gt;
&lt;th&gt;-η-&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;α&lt;/td&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ε&lt;/td&gt;
&lt;td&gt;21&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;η&lt;/td&gt;
&lt;td&gt;80&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ι&lt;/td&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ο&lt;/td&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;υ&lt;/td&gt;
&lt;td&gt;26&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ω&lt;/td&gt;
&lt;td&gt;52&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;σ&lt;/td&gt;
&lt;td&gt;108&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ξ&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ψ&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ζ&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;τ&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;δ&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;θ&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;κ&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;γ&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;χ&lt;/td&gt;
&lt;td&gt;37&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;β&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;π&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;φ&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;λ&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ρ&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;μ&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ν&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Clearly there are some patterns here. Vowels, σ, and the aspirated stops strongly (or even entirely) favour -θη-. The non-aspirated stops seem to entirely favour a plain η. The resonants are a mixed bag.&lt;/p&gt;
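&lt;p&gt;A tally like the one in the table can be sketched in a few lines. This is only an illustration under some assumptions: it handles &lt;strong&gt;3SG&lt;/strong&gt; forms only, strips accents and breathings first, and treats any form ending in θη as a θ-form (which is safe here only because θ itself never precedes the η in our text). The function names are made up.&lt;/p&gt;

```python
import unicodedata
from collections import Counter


def strip_diacritics(form):
    """Remove accents and breathings so only the base Greek letters remain."""
    decomposed = unicodedata.normalize("NFD", form)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def classify_3sg(form):
    """Return (class, preceding letter) for a 3SG (θ)η-aorist form."""
    plain = strip_diacritics(form)
    if plain.endswith("θη"):
        return "θη", plain[-3]
    if plain.endswith("η"):
        return "η", plain[-2]
    raise ValueError(f"not a (θ)η-aorist 3SG form: {form}")


# A handful of 3SG forms quoted in this series, not the full SBLGNT list.
forms = [
    "ἀπεκρίθη", "ἐγενήθη", "ἐχάρη", "ἡρπάγη",
    "ἡρπάσθη", "ἠνοίχθη", "ἠνοίγη", "κατεκάη",
]
tally = Counter(classify_3sg(form) for form in forms)
for (aorist_class, letter), count in sorted(tally.items()):
    print(letter, aorist_class, count)
```

&lt;p&gt;Note that this counts forms, whereas the table above counts lexemes, so a real tally would need to group by lemma first.&lt;/p&gt;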
&lt;p&gt;There are definitely some correlations but it&#39;s unclear what the causal relationship is. And it raises the important question of where the letter before the θ (or η) comes from in the first place. This relates more broadly to the question of the aorist stem. What is the relationship between the aorist stems used in the active, middle and (θ)η forms? In the next post, we&#39;ll start to explore that. Then, after reviewing all our endings so far, we&#39;ll move on to the even bigger question: what&#39;s the relationship between the aorist stem and the present stem?&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part forty-eight of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 47</title>
    <link href="https://jktauber.com/2020/08/19/a-tour-of-greek-morphology-part-47/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 47"/>
    <published>2020-08-19T12:20:00+08:00</published>
    <updated>2020-08-19T12:20:00+08:00</updated>
    <id>https://jktauber.com/2020/08/19/a-tour-of-greek-morphology-part-47</id>
    <content type="html" xml:base="https://jktauber.com/2020/08/19/a-tour-of-greek-morphology-part-47/">&lt;p&gt;Part forty-seven of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;We now turn to the (θ)η-aorists. These are often called aorist &#39;passives&#39; but this is an unhelpful and confusing term. When talking about the &lt;em&gt;form&lt;/em&gt;, it&#39;s better to give a label that simply refers to the form itself rather than to one of the functions that form may or (often) may not be used for. Naming the form for &lt;em&gt;one&lt;/em&gt; of its functions (especially when other forms can be used for the same function) runs the risk of overemphasizing that function and somehow treating other functions as anomalies.&lt;/p&gt;
&lt;p&gt;We must be clear, though, that &#34;(θ)η-aorist&#34; is not a category like &#34;root aorist&#34; or &#34;thematic aorist&#34; or &#34;sigmatic aorist&#34; where different lexemes fall (in most cases exclusively) into just one of those categories without there necessarily being a morphosyntactic distinction. The (θ)η-aorist is a new paradigm available to verbs for expressing a certain voice in contrast to the active and middle forms that we&#39;ve already seen.&lt;/p&gt;
&lt;p&gt;A lot more could be said about all this but that&#39;s outside the scope of a tour of morphological &lt;em&gt;forms&lt;/em&gt;. The main point is that aorists can come in three voice-contrasting paradigms.&lt;/p&gt;
&lt;p&gt;Three of the most common (θ)η-aorists in the New Testament, with broad coverage across personal endings, are γενηθῆναι, ἀποκριθῆναι, and χαρῆναι.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;γίνομαι&lt;/th&gt;
&lt;th&gt;ἀποκρίνομαι&lt;/th&gt;
&lt;th&gt;χαίρω&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;γενηθῆναι&lt;/td&gt;
&lt;td&gt;ἀποκριθῆναι&lt;/td&gt;
&lt;td&gt;χαρῆναι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐγενήθην&lt;/td&gt;
&lt;td&gt;ἀπεκρίθην&lt;/td&gt;
&lt;td&gt;ἐχάρην&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἀπεκρίθης&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐγενήθη&lt;/td&gt;
&lt;td&gt;ἀπεκρίθη&lt;/td&gt;
&lt;td&gt;ἐχάρη&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐγενήθημεν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἐχάρημεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐγενήθητε&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἐχάρητε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐγενήθησαν&lt;/td&gt;
&lt;td&gt;ἀπεκρίθησαν&lt;/td&gt;
&lt;td&gt;ἐχάρησαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;All the above forms appear in the SBLGNT.&lt;/p&gt;
&lt;p&gt;The &#34;vertical&#34; distinguishers are our familiar endings seen in the root aorist actives:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-ναι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-ν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-ς&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-μεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-τε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-σαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The &#34;horizontal&#34; distinguishers, however, look like this:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-(θ)ῆναι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-(θ)ην&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-(θ)ης&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-(θ)η&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-(θ)ημεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-(θ)ητε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-(θ)ησαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The whole category always has a -η- before the ending and most often a -θη-, hence the name (θ)η-aorist.&lt;/p&gt;
&lt;p&gt;By far the most common form in the SBLGNT is ἀπεκρίθη (82 tokens). The second most common form is ἐδόθη (31 tokens) and the third is the plural ἀπεκρίθησαν (19 tokens).&lt;/p&gt;
&lt;p&gt;The next post will look at further counts of these (θ)η-aorists and then we&#39;ll look at the relationship between aorist active, middle and (θ)η forms before moving on to the large question of the relationship between perfective and imperfective forms.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part forty-seven of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 46</title>
    <link href="https://jktauber.com/2020/08/15/a-tour-of-greek-morphology-part-46/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 46"/>
    <published>2020-08-15T14:00:00+08:00</published>
    <updated>2020-08-15T14:00:00+08:00</updated>
    <id>https://jktauber.com/2020/08/15/a-tour-of-greek-morphology-part-46</id>
    <content type="html" xml:base="https://jktauber.com/2020/08/15/a-tour-of-greek-morphology-part-46/">&lt;p&gt;Part forty-six of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;We saw in &lt;a href=&#34;https://jktauber.com/2020/05/10/a-tour-of-greek-morphology-part-42/&#34;&gt;Part 42&lt;/a&gt; that the aorist middle endings were:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;-σθαι&lt;/li&gt;
&lt;li&gt;-μην&lt;/li&gt;
&lt;li&gt;-σο (often with loss of sigma and subsequent contraction)&lt;/li&gt;
&lt;li&gt;-το&lt;/li&gt;
&lt;li&gt;-μεθα&lt;/li&gt;
&lt;li&gt;-σθε&lt;/li&gt;
&lt;li&gt;-ντο&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;either:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;preceded by alpha&lt;/li&gt;
&lt;li&gt;preceded by a theme vowel ε/ο&lt;/li&gt;
&lt;li&gt;affixed directly&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;which correspond to our classes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;alphathematic (first aorists)&lt;/li&gt;
&lt;li&gt;thematic (second aorists)&lt;/li&gt;
&lt;li&gt;root&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Note again what we said in the previous post: this is just a classification based on the distinguisher paradigms and there are other ways of categorizing aorist forms.&lt;/p&gt;
&lt;p&gt;Does this cover all aorist middle indicatives and infinitives in SBLGNT? Are there any words in more than one class or with more than one stem? And what are the counts for the different classes and dominant lemmas or forms within each class?&lt;/p&gt;
&lt;p&gt;We&#39;ll cover that here.&lt;/p&gt;
&lt;p&gt;There is one form that doesn&#39;t match our distinguisher patterns and that&#39;s ξυρᾶσθαι in 1 Cor 11.6. This seems to be an error in the MorphGNT tagging, though. It&#39;s clearly a present (imperfective) infinitive not an aorist (perfective) infinitive and so is not relevant here.&lt;/p&gt;
&lt;p&gt;Now in terms of multiple stems, we do have an augmentation difference with ἐργάζομαι. We find both ἠργασ- and εἰργασ-.&lt;/p&gt;
&lt;p&gt;And in terms of words that seem to appear in more than one class, we have these forms of ἐξαιρέω:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ἐξελέσθαι, which is clearly thematic&lt;/li&gt;
&lt;li&gt;ἐξειλάμην and ἐξείλατο, which are clearly alphathematic&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We also have these forms of ἀποδίδωμι (which we&#39;ve brought up before):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ἀπέδετο&lt;/li&gt;
&lt;li&gt;ἀπέδοσθε&lt;/li&gt;
&lt;li&gt;ἀπέδοντο&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We would &lt;em&gt;expect&lt;/em&gt; the forms to follow the root pattern. ἀπέδοσθε is unambiguously root, ἀπέδετο is unambiguously thematic. ἀπέδοντο could be taken to be root or thematic. For the purposes of the counts below, we&#39;ll take ἀπέδοντο to be root.&lt;/p&gt;
&lt;p&gt;Note that ἐκδίδωμι only appears as ἐξέδετο which is also thematic so there&#39;s definitely some reanalysis going on with the δίδωμι compounds.&lt;/p&gt;
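&lt;p&gt;The reasoning about the three ἀποδίδωμι forms can be sketched as a small check. We build the form each class would predict and see which predictions match the attested form. This is only an illustration: the function names are made up, accents are ignored, the augmented prefix ἀπεδ- is hard-coded, and the theme vowel is taken to be ο before μ or ν and ε otherwise.&lt;/p&gt;

```python
import unicodedata

# Aorist middle endings for the three cells attested for ἀποδίδωμι.
ENDINGS = {"3SG": "το", "2PL": "σθε", "3PL": "ντο"}


def strip_diacritics(form):
    """Remove accents and breathings so only the base Greek letters remain."""
    decomposed = unicodedata.normalize("NFD", form)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def theme_vowel(ending):
    """Theme vowel assumed here: ο before μ or ν, ε otherwise."""
    return "ο" if ending[0] in "μν" else "ε"


def possible_classes(form, cell, augmented_prefix="απεδ"):
    """Return the set of classes whose predicted form matches `form`."""
    ending = ENDINGS[cell]
    predicted = {
        # Root pattern: the ending is affixed directly to the root δο.
        "root": augmented_prefix + "ο" + ending,
        # Thematic pattern: theme vowel between δ and the ending.
        "thematic": augmented_prefix + theme_vowel(ending) + ending,
    }
    plain = strip_diacritics(form)
    return {name for name, candidate in predicted.items() if candidate == plain}


for attested, cell in [("ἀπέδετο", "3SG"), ("ἀπέδοσθε", "2PL"), ("ἀπέδοντο", "3PL")]:
    print(attested, sorted(possible_classes(attested, cell)))
```

&lt;p&gt;Only ἀπέδοντο matches both predictions, because the theme vowel before ν is ο, the same vowel the root δο already ends in.&lt;/p&gt;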
&lt;p&gt;Here are the total counts across classes for aorist middles in SBLGNT:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;class&lt;/th&gt;
&lt;th&gt;# lemmas&lt;/th&gt;
&lt;th&gt;# tokens&lt;/th&gt;
&lt;th&gt;# hapakes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;alphathematic&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;109&lt;/td&gt;
&lt;td&gt;393&lt;/td&gt;
&lt;td&gt;49&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;thematic&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;320&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;root&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;39&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;(yes, &#34;hapakes&#34; is a running joke, equivalent to calling them &#34;the onces&#34;)&lt;/p&gt;
&lt;p&gt;ἀπο-δο is the only root ending with ο and the rest of the root endings are θε and compounds:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;θε&lt;/li&gt;
&lt;li&gt;ἐκ-θε&lt;/li&gt;
&lt;li&gt;ἀπο-θε&lt;/li&gt;
&lt;li&gt;ἐπι-θε&lt;/li&gt;
&lt;li&gt;δια-θε&lt;/li&gt;
&lt;li&gt;κατα-θε&lt;/li&gt;
&lt;li&gt;παρα-θε&lt;/li&gt;
&lt;li&gt;προσ-θε&lt;/li&gt;
&lt;li&gt;προ-θε&lt;/li&gt;
&lt;li&gt;συν-θε&lt;/li&gt;
&lt;li&gt;ἀνα-θε&lt;/li&gt;
&lt;li&gt;προσ-ανα-θε&lt;/li&gt;
&lt;li&gt;συν-επι-θε&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The thematics come from ten families and are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;γεν (γίνομαι)&lt;/li&gt;
&lt;li&gt;παρα-γεν&lt;/li&gt;
&lt;li&gt;πυθ (πυνθάνομαι)&lt;/li&gt;
&lt;li&gt;ἐξ-ελ (-αἰρέω)&lt;/li&gt;
&lt;li&gt;συμ-βαλ (-βάλλω)&lt;/li&gt;
&lt;li&gt;περι-βαλ&lt;/li&gt;
&lt;li&gt;ἀνα-βαλ&lt;/li&gt;
&lt;li&gt;ἀνα-σχ (-έχομαι)&lt;/li&gt;
&lt;li&gt;ἀπ-ολ (-όλλυμι)&lt;/li&gt;
&lt;li&gt;συν-απ-ολ&lt;/li&gt;
&lt;li&gt;ἐκ-δο (-δίδωμι)&lt;/li&gt;
&lt;li&gt;ἀπο-δο&lt;/li&gt;
&lt;li&gt;συλ-λαβ (-λαμβάνω)&lt;/li&gt;
&lt;li&gt;κατα-λαβ&lt;/li&gt;
&lt;li&gt;ἐπι-λαβ&lt;/li&gt;
&lt;li&gt;προσ-λαβ&lt;/li&gt;
&lt;li&gt;ἀντι-λαβ&lt;/li&gt;
&lt;li&gt;ἀφ-ικ (-ικνέομαι)&lt;/li&gt;
&lt;li&gt;ἐφ-ικ&lt;/li&gt;
&lt;li&gt;ἐπι-λαθ (-λανθάνομαι)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;But note that γίνομαι alone makes up 269 out of the 320 tokens!&lt;/p&gt;
&lt;p&gt;Now by person/number:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;root&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;63&lt;/td&gt;
&lt;td&gt;45&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;23&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;208&lt;/td&gt;
&lt;td&gt;227&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;22&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;57&lt;/td&gt;
&lt;td&gt;26&lt;/td&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Because μι verbs have root forms throughout the middle (not just in the infinitive like in the active in Hellenistic Greek) we don&#39;t get the disproportionately high &lt;strong&gt;INF&lt;/strong&gt; &lt;strong&gt;root&lt;/strong&gt; counts that we did in the active.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;3SG&lt;/strong&gt; expectedly dominates. This is particularly true in the &lt;strong&gt;thematic&lt;/strong&gt;, in large part due to ἐγένετο. But in addition to the dominance of the &lt;strong&gt;3SG&lt;/strong&gt; by ἐγένετο, all the &lt;strong&gt;2SG&lt;/strong&gt; and &lt;strong&gt;2PL&lt;/strong&gt; thematic aorist middles are γεν and most of the &lt;strong&gt;INF&lt;/strong&gt;, &lt;strong&gt;1SG&lt;/strong&gt; and &lt;strong&gt;3PL&lt;/strong&gt; are too, as shown here in our table of dominant forms:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;root&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;γενέσθαι 36/45&lt;/td&gt;
&lt;td&gt;καταθέσθαι 2/3 ἀποθέσθαι 1/3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἐγενόμην 12/16&lt;/td&gt;
&lt;td&gt;προεθέμην 1/3 προσανεθέμην 1/3 ἀνεθέμην 1/3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;κατηρτίσω 2/8 ἠρνήσω 2/8&lt;/td&gt;
&lt;td&gt;ἐγένου 2/2&lt;/td&gt;
&lt;td&gt;ἔθου 1/1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἐγένετο 201/227&lt;/td&gt;
&lt;td&gt;ἔθετο 7/16&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἐγένεσθε 4/4&lt;/td&gt;
&lt;td&gt;ἔθεσθε 1/2 ἀπέδοσθε 1/2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἤρξαντο 19/57&lt;/td&gt;
&lt;td&gt;ἐγένοντο 14/26&lt;/td&gt;
&lt;td&gt;ἔθεντο 4/14&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;As with the actives, there is greater lexical variety amongst the alphathematics than amongst the thematics.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;class&lt;/th&gt;
&lt;th&gt;token-lemma ratio&lt;/th&gt;
&lt;th&gt;% hapakes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;alphathematic&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3.61&lt;/td&gt;
&lt;td&gt;45.0 %&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;thematic&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;16.00&lt;/td&gt;
&lt;td&gt;50.0 %&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;root&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2.79&lt;/td&gt;
&lt;td&gt;21.4 %&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The top 5% of alphathematic lemmas make up 32.1% of the tokens whereas the top 5% of thematic lemmas make up a whopping 84.1% of tokens. For the actives, recall the numbers were 44.1% and 60.7% respectively.&lt;/p&gt;
&lt;p&gt;In the next couple of posts we&#39;ll look at the (θ)η aorists (often misleadingly called aorist &#34;passives&#34;).&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part forty-six of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">Picking the Words for Greek Typing</title>
    <link href="https://jktauber.com/2020/08/01/picking-the-words-for-greektyping/" rel="alternate" type="text/html" title="Picking the Words for Greek Typing"/>
    <published>2020-08-01T16:20:00+08:00</published>
    <updated>2020-08-01T16:20:00+08:00</updated>
    <id>https://jktauber.com/2020/08/01/picking-the-words-for-greektyping</id>
    <content type="html" xml:base="https://jktauber.com/2020/08/01/picking-the-words-for-greektyping/">&lt;p&gt;Last week, we launched &lt;a href=&#34;https://greektyping.com&#34;&gt;greektyping.com&lt;/a&gt; to help people get better at typing Greek. Aurélien Berra &lt;a href=&#34;https://twitter.com/aurelberra/status/1287863228232597506&#34;&gt;asked&lt;/a&gt; what the method of choosing words to type was so I thought I&#39;d write a blog post about it.&lt;/p&gt;
&lt;h2&gt;Step 1.&lt;/h2&gt;
&lt;p&gt;I took &lt;a href=&#34;https://github.com/morphgnt/sblgnt&#34;&gt;MorphGNT SBLGNT&lt;/a&gt; and wrote a script that made a list of words from it as follows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;every token in the text including punctuation&lt;/li&gt;
&lt;li&gt;every token in the text with punctuation stripped&lt;/li&gt;
&lt;li&gt;every &lt;a href=&#34;https://jktauber.com/2018/07/23/normalisation-column-morphgnt/&#34;&gt;normalized token&lt;/a&gt; in the text but if it has a movable final character, add both with and without&lt;/li&gt;
&lt;li&gt;the previous but with accents stripped&lt;/li&gt;
&lt;li&gt;every lemma in the text&lt;/li&gt;
&lt;li&gt;the lemma but with accents stripped&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So up to 8 potential &#34;words&#34; from each token in the SBLGNT, but then with duplicates removed. This led to 55,496 unique &#34;words&#34;.&lt;/p&gt;
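&lt;p&gt;A minimal sketch of how one token could yield up to 8 &#34;words&#34;, assuming the token text, its normalized form(s) and its lemma have already been read from MorphGNT. The names &lt;code&gt;PUNCTUATION&lt;/code&gt;, &lt;code&gt;strip_accents&lt;/code&gt; and &lt;code&gt;candidate_words&lt;/code&gt; are hypothetical, as is the exact set of punctuation and accent characters; this is not the actual script.&lt;/p&gt;

```python
import unicodedata

# Assumed punctuation set; the real script may strip a different set.
PUNCTUATION = ".,;\u00b7\u0387\u2014()[]"

# Combining grave, acute and Greek perispomeni (breathings are kept).
ACCENTS = {"\u0300", "\u0301", "\u0342"}


def strip_accents(word):
    """Remove accents from a Greek word while keeping breathings."""
    decomposed = unicodedata.normalize("NFD", word)
    bare = "".join(ch for ch in decomposed if ch not in ACCENTS)
    return unicodedata.normalize("NFC", bare)


def candidate_words(token, normalized_forms, lemma):
    """Return the set of drill words derived from one token.

    normalized_forms holds one form, or two when a movable final
    character means both variants should be included.
    """
    words = {token, token.strip(PUNCTUATION)}
    for form in normalized_forms:
        words.add(form)
        words.add(strip_accents(form))
    words.add(lemma)
    words.add(strip_accents(lemma))
    return words


# Illustrative call; duplicates collapse because the result is a set.
example = candidate_words("εἶπεν,", ["εἶπε", "εἶπεν"], "λέγω")
```

&lt;p&gt;Taking the union of these sets over every token would give the de-duplicated word list.&lt;/p&gt;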
&lt;h2&gt;Step 2.&lt;/h2&gt;
&lt;p&gt;I grouped every individual Greek character (209 of them) found in the above word list into 30 &#34;chapter&#34; buckets. For example, I put &#34;κ&#34; in chapter 1 and &#34;ξ&#34; in chapter 4 and &#34;έ&#34; in chapter 8 and &#34;ἤ&#34; in chapter 14 and so on. This wasn&#39;t done computationally, just manually. Each chapter has a theme: something new that gets introduced and, other than chapter 5 which covers the uppercase letters, there are no more than 9 new characters in each chapter and usually 5–8.&lt;/p&gt;
&lt;h2&gt;Step 3.&lt;/h2&gt;
&lt;p&gt;I then wrote a script that went through all 55,496 &#34;words&#34; from Step 1 and, for each character, looked up which chapter from Step 2 that character was introduced in. Then, for each word, the script noted the earliest chapter by which every character in that word has been introduced, which is the latest of its characters&#39; chapters.&lt;/p&gt;
&lt;p&gt;In other words, if &lt;code&gt;chapter&lt;/code&gt; is a mapping from a character to what chapter number it is in, calculate &lt;code&gt;max(chapter[character] for character in word)&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;At this point the script has built a table of 55,496 words each with the &#34;target chapter&#34; they can be introduced in.&lt;/p&gt;
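&lt;p&gt;The calculation can be sketched as follows. Only the chapter numbers for κ, ξ, έ and ἤ come from Step 2 above; the other entries in the mapping and the function name &lt;code&gt;target_chapter&lt;/code&gt; are made up for illustration.&lt;/p&gt;

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so equivalent encodings of a character compare equal."""
    return unicodedata.normalize("NFC", text)


# Hypothetical excerpt of the manual character-to-chapter mapping.
# κ, ξ, έ and ἤ follow the post; α, ι and ί are invented placements.
chapter = {nfc(ch): number for ch, number in {
    "κ": 1, "α": 1, "ι": 1, "ξ": 4, "έ": 8, "ί": 8, "ἤ": 14,
}.items()}


def target_chapter(word, chapter):
    """Earliest chapter by which every character of the word has appeared."""
    return max(chapter[character] for character in nfc(word))


# Build the word-to-target-chapter table for a few sample words.
table = {word: target_chapter(word, chapter) for word in ["και", "καί", "ἤ"]}
```

&lt;p&gt;A word made only of chapter 1 characters lands in chapter 1, while a single later character (such as an accented vowel) pushes the whole word to that later chapter.&lt;/p&gt;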
&lt;h2&gt;Step 4.&lt;/h2&gt;
&lt;p&gt;When a user on &lt;a href=&#34;https://greektyping.com&#34;&gt;greektyping.com&lt;/a&gt; is doing a particular chapter, here&#39;s what happens:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;the table is queried for all the words whose target chapter is the current chapter being done.&lt;/li&gt;
&lt;li&gt;a sample of 10 is taken from the result (or all of them if there are fewer than 10 words for a given target chapter, which happens in chapters 22, 24, 25, 26, and 28)&lt;/li&gt;
&lt;li&gt;this sample is sorted by length&lt;/li&gt;
&lt;li&gt;the user is presented with that list&lt;/li&gt;
&lt;/ol&gt;
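&lt;p&gt;The four steps above can be sketched like this, assuming the table is a plain dictionary from word to target chapter and that the sample is drawn uniformly at random. The function name &lt;code&gt;drill_words&lt;/code&gt; and the tiny table are hypothetical; the site itself presumably queries a database instead.&lt;/p&gt;

```python
import random


def drill_words(table, current_chapter, sample_size=10, rng=random):
    """Pick the words a user is shown for one chapter."""
    # 1. all words whose target chapter is the current chapter
    candidates = [word for word, target in table.items()
                  if target == current_chapter]
    # 2. a sample of up to sample_size words (all of them if there are fewer)
    sample = rng.sample(candidates, min(sample_size, len(candidates)))
    # 3. shortest words first
    return sorted(sample, key=len)


# Hypothetical table: a few words with made-up target chapters.
table = {"και": 1, "κακα": 1, "ακ": 1, "καί": 8, "ἤ": 14}

# 4. the list presented to the user for chapter 1
presented = drill_words(table, 1)
```

&lt;p&gt;Because the sample is redrawn on each call, repeating a chapter generally gives a different list whenever the chapter has more than 10 words.&lt;/p&gt;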
&lt;p&gt;So that&#39;s how it works. It would be fairly easy to apply to other Greek texts (they don&#39;t have to be analysed to the extent MorphGNT is). But even with just the MorphGNT there&#39;s a lot of &#34;replayability&#34;. Chapter 8 alone has 16,704 words you &lt;em&gt;could&lt;/em&gt; be tested on.&lt;/p&gt;
&lt;p&gt;We&#39;ll probably add some richer statistics at some point and also typing of longer units of text but for now our focus is on adding instructions for more keyboard layouts (the drills themselves will stay the same, though).&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Last week, we launched &lt;a href=&#34;https://greektyping.com&#34;&gt;greektyping.com&lt;/a&gt; to help people get better at typing Greek. Aurélien Berra &lt;a href=&#34;https://twitter.com/aurelberra/status/1287863228232597506&#34;&gt;asked&lt;/a&gt; what the method of choosing words to type was so I thought I&#39;d write a blog post about it.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 45</title>
    <link href="https://jktauber.com/2020/07/28/a-tour-of-greek-morphology-part-45/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 45"/>
    <published>2020-07-28T23:20:00+08:00</published>
    <updated>2020-07-28T23:20:00+08:00</updated>
    <id>https://jktauber.com/2020/07/28/a-tour-of-greek-morphology-part-45</id>
    <content type="html" xml:base="https://jktauber.com/2020/07/28/a-tour-of-greek-morphology-part-45/">&lt;p&gt;Part forty-five of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;We&#39;ve classified aorist active endings into three classes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;alphathematic (first aorists including kappa, sigmatic, and pseudo-sigmatic)&lt;/li&gt;
&lt;li&gt;thematic (second aorists)&lt;/li&gt;
&lt;li&gt;root&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It&#39;s important to stress that this is a classification of distinguisher paradigms. It is related to but distinct from other ways of classifying aorists based on the properties of the stem and how it relates to the imperfective (present) stem. We&#39;ll get to those other ways in a few posts&#39; time but for now, our classification is just based on the distinctive set of &lt;em&gt;endings&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;As we&#39;ve done before, we&#39;ll now take this classification and look at various counts in the SBLGNT. How many times do we encounter tokens of each class? How many different lemmas are in each class? Which paradigm cells are most common for each class?&lt;/p&gt;
&lt;p&gt;Let&#39;s start with just the lemma and token counts as well as the number of lemmas that only occur once in the SBLGNT text.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;class&lt;/th&gt;
&lt;th&gt;# lemmas&lt;/th&gt;
&lt;th&gt;# tokens&lt;/th&gt;
&lt;th&gt;# hapakes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;alphathematic&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;661&lt;/td&gt;
&lt;td&gt;2973&lt;/td&gt;
&lt;td&gt;326&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;thematic&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;103&lt;/td&gt;
&lt;td&gt;2082&lt;/td&gt;
&lt;td&gt;33&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;root&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;36&lt;/td&gt;
&lt;td&gt;262&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;There is more lexical variety in the alphathematic class, especially when compared with the thematic class. This can be seen in the token-lemma ratio and in the percentage of lemmas that are hapakes.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;class&lt;/th&gt;
&lt;th&gt;token-lemma ratio&lt;/th&gt;
&lt;th&gt;% hapakes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;alphathematic&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4.50&lt;/td&gt;
&lt;td&gt;49.3 %&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;thematic&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;20.21&lt;/td&gt;
&lt;td&gt;32.0 %&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;root&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;7.28&lt;/td&gt;
&lt;td&gt;41.7 %&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
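&lt;p&gt;Both columns follow directly from the counts in the first table: the token-lemma ratio is tokens divided by lemmas, and the hapax percentage is hapakes divided by lemmas. A short check in Python (the function name &lt;code&gt;lexical_variety&lt;/code&gt; is just for illustration):&lt;/p&gt;

```python
# (lemmas, tokens, hapakes) per class, copied from the counts table
counts = {
    "alphathematic": (661, 2973, 326),
    "thematic": (103, 2082, 33),
    "root": (36, 262, 15),
}


def lexical_variety(lemmas, tokens, hapakes):
    """Return (token-lemma ratio, percentage of lemmas that are hapakes)."""
    return round(tokens / lemmas, 2), round(100 * hapakes / lemmas, 1)


for name, (lemmas, tokens, hapakes) in counts.items():
    ratio, percent = lexical_variety(lemmas, tokens, hapakes)
    print(f"{name}: {ratio:.2f} tokens per lemma, {percent:.1f} % hapakes")
```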
&lt;p&gt;Another way to see this is to ask what percentage of tokens are forms of the most frequent 5%, 10%, 25%, or 50% of lemmas.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;5%&lt;/th&gt;
&lt;th&gt;10%&lt;/th&gt;
&lt;th&gt;25%&lt;/th&gt;
&lt;th&gt;50%&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;alphathematic&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;44.1%&lt;/td&gt;
&lt;td&gt;57.4%&lt;/td&gt;
&lt;td&gt;76.1%&lt;/td&gt;
&lt;td&gt;88.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;thematic&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;60.7%&lt;/td&gt;
&lt;td&gt;76.8%&lt;/td&gt;
&lt;td&gt;89.6%&lt;/td&gt;
&lt;td&gt;96.4%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;root&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;21.8%&lt;/td&gt;
&lt;td&gt;47.7%&lt;/td&gt;
&lt;td&gt;80.5%&lt;/td&gt;
&lt;td&gt;92.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;This table is saying that the top 5% of lemmas with alphathematic forms make up 44.1% of alphathematic tokens but the top 5% of lemmas with thematic forms make up 60.7% of thematic tokens.&lt;/p&gt;
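&lt;p&gt;A rough sketch of how such a figure could be computed from per-lemma token counts. The function name &lt;code&gt;top_share&lt;/code&gt;, the rounding up of the lemma cut-off, and the toy counts are all assumptions made for illustration, not the script behind the table.&lt;/p&gt;

```python
import math


def top_share(lemma_counts, fraction):
    """Share of tokens that are forms of the most frequent lemmas.

    lemma_counts maps each lemma to its token count; fraction is the
    proportion of lemmas to take from the top (for example 0.05).
    """
    ranked = sorted(lemma_counts.values(), reverse=True)
    # Assumed convention: round the number of lemmas up, with at least one.
    top_n = max(1, math.ceil(len(ranked) * fraction))
    return sum(ranked[:top_n]) / sum(ranked)


# Toy counts for four hypothetical lemmas (20 tokens in total).
toy_counts = {"lemma_a": 10, "lemma_b": 5, "lemma_c": 3, "lemma_d": 2}

quarter = top_share(toy_counts, 0.25)  # top 1 of 4 lemmas
half = top_share(toy_counts, 0.5)      # top 2 of 4 lemmas
```

&lt;p&gt;In the toy data the single most frequent lemma already covers half the tokens, which is the kind of concentration the thematic row shows.&lt;/p&gt;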
&lt;p&gt;In other words, the thematic aorist active tokens are drawn from a smaller set of lemmas than the alphathematic. In fact, a third of thematic aorist active tokens in SBLGNT are forms of εἶπον (and, as we&#39;ll see in a moment, mostly &lt;strong&gt;3SG&lt;/strong&gt;).&lt;/p&gt;
&lt;p&gt;One interesting anomaly perhaps worth coming back to at some stage (I wasn&#39;t aware of it until now) is that at the top 5% and top 10% lemma levels, the root aorist token % is lower than the alphathematic but at the 25% and 50% levels it is higher.&lt;/p&gt;
&lt;p&gt;Okay, that&#39;s distribution across the three classes of ending. What about individual paradigm cell counts?&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;root&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;509&lt;/td&gt;
&lt;td&gt;351&lt;/td&gt;
&lt;td&gt;95&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;224&lt;/td&gt;
&lt;td&gt;163&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;88&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1244&lt;/td&gt;
&lt;td&gt;1143&lt;/td&gt;
&lt;td&gt;94&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;80&lt;/td&gt;
&lt;td&gt;45&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;119&lt;/td&gt;
&lt;td&gt;40&lt;/td&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;709&lt;/td&gt;
&lt;td&gt;310&lt;/td&gt;
&lt;td&gt;42&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;In all cases, the infinitive and third person dominate.&lt;/p&gt;
&lt;p&gt;It is interesting that in the alphathematics, &lt;strong&gt;3SG&lt;/strong&gt; dominates with &lt;strong&gt;3PL&lt;/strong&gt; next and then &lt;strong&gt;INF&lt;/strong&gt;. In the thematics, &lt;strong&gt;3SG&lt;/strong&gt; dominates even more followed by &lt;strong&gt;INF&lt;/strong&gt; with &lt;strong&gt;3PL&lt;/strong&gt; not far behind. In the root aorists, the &lt;strong&gt;INF&lt;/strong&gt; is actually up with the &lt;strong&gt;3SG&lt;/strong&gt; with &lt;strong&gt;3PL&lt;/strong&gt; a distant third. Recall the μι verbs have a root form in the &lt;strong&gt;INF&lt;/strong&gt; but nowhere else. This likely explains why the &lt;strong&gt;INF&lt;/strong&gt; makes up such a large proportion of root form tokens.&lt;/p&gt;
&lt;p&gt;Within the 1st and 2nd person cells, the &lt;strong&gt;1SG&lt;/strong&gt; dominates in the alphathematic and especially the thematic. In the root, the &lt;strong&gt;2PL&lt;/strong&gt; is actually on par with the &lt;strong&gt;1SG&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Again this is worthy of closer inspection but there are definitely individual lexical items at work here.&lt;/p&gt;
&lt;p&gt;As we&#39;ve done before, let&#39;s look at which lemmas (if any) dominate particular paradigm cell counts.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;root&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;δοῦναι 33/95&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;εἶδον 54/163&lt;/td&gt;
&lt;td&gt;ἔγνων 6/12 ἀνέβην 3/12&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;εἶδες 8/30&lt;/td&gt;
&lt;td&gt;ἔγνως 3/3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;εἶπε(ν) 610/1143&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἐνέβημεν 1/3 ἐπέγνωμεν 1/3 ἐξέστημεν 1/3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἐλάβετε 13/40&lt;/td&gt;
&lt;td&gt;ἀνέγνωτε 10/13&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἔγνωσαν 17/42&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Consistent with the greater lexical variety of the alphathematic class, its cells are not dominated by any one lexical item at all.&lt;/p&gt;
&lt;p&gt;In the thematics, though, we see the disproportionate occurrence of εἶδον in the &lt;strong&gt;1SG&lt;/strong&gt; and &lt;strong&gt;2SG&lt;/strong&gt; and especially of εἶπον in the &lt;strong&gt;3SG&lt;/strong&gt; where it makes up more than half the occurrences of thematic &lt;strong&gt;3SG&lt;/strong&gt; aorist actives.&lt;/p&gt;
&lt;p&gt;Note that no root aorist lemma dominates the &lt;strong&gt;3SG&lt;/strong&gt; cell but all the other cells have a small set of lemmas covering a lot of occurrences. ἔγνως is the only root &lt;strong&gt;2SG&lt;/strong&gt; form in the SBLGNT, and ἀνέγνωτε makes up 77% of root &lt;strong&gt;2PL&lt;/strong&gt; occurrences.&lt;/p&gt;
&lt;p&gt;One thing that might be slightly misleading about the lemma numbers for the thematic and (especially) root aorists is inclusion of compound verbs with preverbs. The 103 thematic aorist active lemmas actually come from 27 base verbs (there are 16 lexical items just from ἔρχομαι/ἦλθον for example). The 36 root aorist active lemmas actually come from just 7 base verbs and 3 of those (δίδωμι, τίθημι, and ἀφ-ίημι) only have a root ending in the infinitive.&lt;/p&gt;
&lt;p&gt;So the only fully root verbs in the SBLGNT are the γνω family, the βη family, the στη family, and δυ. With the exception of δυ which has only one instance, the rest have reasonable token counts (82 for γνω, 71 for βη, 55 for στη).&lt;/p&gt;
&lt;p&gt;The thematic aorist base verbs with the highest token counts are: the εἰπ family (689), the ἐλθ family (538), εἰδ (178), the λαβ family (125), the ἀγαγ family (71), the βαλ family (70), the ἀπο-θαν family (67), εὑρ (58), the πεσ family (57).&lt;/p&gt;
&lt;p&gt;Next up we&#39;ll look at the aorist middles again.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part forty-five of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Polytonic Greek Typing Tutor</title>
    <link href="https://jktauber.com/2020/07/26/a-polytonic-greek-typing-tutor/" rel="alternate" type="text/html" title="A Polytonic Greek Typing Tutor"/>
    <published>2020-07-26T16:20:00+08:00</published>
    <updated>2020-07-26T16:20:00+08:00</updated>
    <id>https://jktauber.com/2020/07/26/a-polytonic-greek-typing-tutor</id>
    <content type="html" xml:base="https://jktauber.com/2020/07/26/a-polytonic-greek-typing-tutor/">&lt;p&gt;I&#39;ve revived an old web application to help people practice typing Ancient Greek.&lt;/p&gt;
&lt;p&gt;Being able to type Greek fluently, diacritics and all, is an often neglected skill for classical and biblical language students but it&#39;s one that is increasingly important whether you&#39;re doing searches, writing essays, editing electronic editions, or just chatting about (or even better, &lt;em&gt;in&lt;/em&gt;) Greek online.&lt;/p&gt;
&lt;p&gt;A few years ago, I wrote a simple web application to help me practice typing using the built-in &lt;strong&gt;Greek&amp;nbsp;-&amp;nbsp;Polytonic&lt;/strong&gt; input source on macOS. I grouped all the characters (including with full diacritics) into an ordered sequence with 30 stages then wrote a script to find Greek words in the New Testament that only used letters and diacritics appropriate for each stage in the sequence.&lt;/p&gt;
&lt;p&gt;Talking to a classics lecturer a couple of weeks ago, he brought up the increasing need for students to be able to type Greek, and I said: &#34;oh, I have a web app for that&#34;. But I realised it needed a bit of polish.&lt;/p&gt;
&lt;p&gt;That polish is now done (with some help from my colleague Patrick Altman) and we&#39;ve now launched&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;a href=&#34;https://greektyping.com&#34;&gt;https://greektyping.com&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The instructions are currently still just for the macOS Greek&amp;nbsp;-&amp;nbsp;Polytonic input source but we&#39;ve put in place some of the framework to support different instructions depending on your operating system and keyboard layout setup. We&#39;ll add new instructions for new keyboard layouts over time.&lt;/p&gt;
&lt;p&gt;Even with missing instructions, it should mostly be possible to actually do the timed exercises with any keyboard layout as you are just assessed on the Unicode characters you are producing, not how you produced them on your particular keyboard.&lt;/p&gt;
&lt;p&gt;Hopefully, though, this will be a helpful resource to all those who want to be able to type Ancient Greek faster. And we&#39;ll continue to improve it over time, both in terms of instructions for other layouts but also some more features, interesting stats, and fun games.&lt;/p&gt;
&lt;p&gt;And there is no reason we can&#39;t extend it to other writing systems too.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I&#39;ve revived an old web application to help people practice typing Ancient Greek.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 44</title>
    <link href="https://jktauber.com/2020/07/06/a-tour-of-greek-morphology-part-44/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 44"/>
    <published>2020-07-06T08:20:00+08:00</published>
    <updated>2020-07-06T08:20:00+08:00</updated>
    <id>https://jktauber.com/2020/07/06/a-tour-of-greek-morphology-part-44</id>
    <content type="html" xml:base="https://jktauber.com/2020/07/06/a-tour-of-greek-morphology-part-44/">&lt;p&gt;Part forty-four of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;Let&#39;s now go through the remaining aorist actives (indicatives and infinitives) that exhibit multiple stems or multiple stems and ending classes.&lt;/p&gt;
&lt;h2&gt;Differences in just the stem&lt;/h2&gt;
&lt;h3&gt;προεφήτευσεν vs ἐπροφήτευσεν&lt;/h3&gt;
&lt;p&gt;The difference is just whether the augment honours the preverb or not.&lt;/p&gt;
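&lt;table&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;προεφήτευσεν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;προ-&lt;/td&gt;
&lt;td&gt;-ε-&lt;/td&gt;
&lt;td&gt;-φήτευσεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ἐπροφήτευσεν&lt;/td&gt;
&lt;td&gt;ἐ-&lt;/td&gt;
&lt;td&gt;-προ-&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;-φήτευσεν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;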
&lt;h3&gt;ἀνέῳξεν vs ἤνοιξεν vs ἠνέῳξεν&lt;/h3&gt;
&lt;p&gt;ἀνέῳξ- was the earlier aorist form and then later we find ἤνοιξ- and ἠνέῳξ-. The SBLGNT has all three in the &lt;strong&gt;3SG&lt;/strong&gt;. Like in the previous example, this is also a difference in whether the augmentation honours the preverb (ἀνέῳξ-) or not (ἤνοιξ-) but with a third form where it&#39;s effectively augmented in both places (ἠνέῳξ-).&lt;/p&gt;
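&lt;table&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ἀνέῳξεν&lt;/td&gt;
&lt;td&gt;ἀν-&lt;/td&gt;
&lt;td&gt;-έῳ-&lt;/td&gt;
&lt;td&gt;-ξεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ἤνοιξεν&lt;/td&gt;
&lt;td&gt;ἤν-&lt;/td&gt;
&lt;td&gt;-οι-&lt;/td&gt;
&lt;td&gt;-ξεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ἠνέῳξεν&lt;/td&gt;
&lt;td&gt;ἠν-&lt;/td&gt;
&lt;td&gt;-έῳ-&lt;/td&gt;
&lt;td&gt;-ξεν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;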
&lt;h3&gt;πεῖν vs πιεῖν&lt;/h3&gt;
&lt;p&gt;The aorist active infinitive of πίνω/ἔπῐον can exhibit a &#34;Hellenistic contraction&#34;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;-ιει- /-ǐī-/ → -ει- /-ī-/.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;ἔγημα vs ἐγάμησε&lt;/h3&gt;
&lt;p&gt;ἔγημα is the earlier form and ἐγάμησα developed later, presumably by analogy with other -εω verbs (which we&#39;ll explore later).&lt;/p&gt;
&lt;h3&gt;παρήγγειλε vs παρήγγελλε&lt;/h3&gt;
&lt;p&gt;I think this is a mistake in MorphGNT SBLGNT and should be tagged imperfect (in Luke 8.29).&lt;/p&gt;
&lt;h3&gt;κατέλειπε vs κατέλιπε&lt;/h3&gt;
&lt;p&gt;I think this is a mistake in MorphGNT SBLGNT and should be tagged imperfect (in Luke 10.40).&lt;/p&gt;
&lt;h2&gt;Differences in stem and class (non-μι verbs)&lt;/h2&gt;
&lt;h3&gt;παρέλαβον vs παρελάβοσαν&lt;/h3&gt;
&lt;p&gt;παραλαμβάνω has a pretty standard thematic aorist but we find the &lt;strong&gt;3PL&lt;/strong&gt; form παρελάβοσαν alongside the expected παρέλαβον.&lt;/p&gt;
&lt;h3&gt;ἀνέκραξαν vs ἀνέκραγον&lt;/h3&gt;
&lt;p&gt;The aorist of ἀνακράζω was originally thematic ἀν-έ-κραγ-ο- but started to develop a sigmatic variant ἀν-έ-κραγ-σ-.&lt;/p&gt;
&lt;p&gt;In the SBLGNT we mostly find the later sigmatic variant but the &lt;strong&gt;3PL&lt;/strong&gt; also appears in the original thematic form (ἀνέκραγον alongside ἀνέκραξαν).&lt;/p&gt;
&lt;h3&gt;ἤγαγον vs ἦξα (and compounds)&lt;/h3&gt;
&lt;p&gt;In the SBLGNT, we find συνήγαγον vs συνῆξα and ἐπισυναγαγεῖν vs ἐπισυνάξαι.&lt;/p&gt;
&lt;p&gt;In other words, the stems are ἀγ-αγ-ο vs ἀγ-σ.&lt;/p&gt;
&lt;h2&gt;Differences in stem and class (μι verbs other than ἵστημι and compounds)&lt;/h2&gt;
&lt;p&gt;Recall that in the Hellenistic period, the &lt;strong&gt;INF&lt;/strong&gt; was still mostly a root aorist but most of the other forms were kappa alphathematics (not just in the singular, as was the case classically, but in the plural too through levelling). Occasionally a thematic form creeps in though.&lt;/p&gt;
&lt;h3&gt;τίθημι and compounds&lt;/h3&gt;
&lt;p&gt;All root in the &lt;strong&gt;INF&lt;/strong&gt; and kappa alphathematics elsewhere.&lt;/p&gt;
&lt;h3&gt;δίδωμι and compounds&lt;/h3&gt;
&lt;p&gt;As expected except for the &lt;strong&gt;3PL&lt;/strong&gt; παρέδοσαν (i.e. the classical form) alongside παρέδωκαν.&lt;/p&gt;
&lt;h3&gt;ἀφίημι&lt;/h3&gt;
&lt;p&gt;As expected except for the &lt;strong&gt;2SG&lt;/strong&gt; ἀφῆκες where we&#39;d expect ἀφῆκας.&lt;/p&gt;
&lt;h2&gt;ἵστημι and compounds&lt;/h2&gt;
&lt;p&gt;Unlike the other verbs with μι presents, ἵστημι has no kappa alphathematics. In fact, even classically, the entire aorist paradigm had a full set of root aorist forms alongside a full set of sigmatic alphathematic aorists.&lt;/p&gt;
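&lt;table&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;στῆ-ναι&lt;/td&gt;
&lt;td&gt;στῆσ-αι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔ-στη-ν&lt;/td&gt;
&lt;td&gt;ἔ-στησ-α&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔ-στη-ς&lt;/td&gt;
&lt;td&gt;ἔ-στησ-α-ς&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔ-στη&lt;/td&gt;
&lt;td&gt;ἔ-στησ-ε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔ-στη-μεν&lt;/td&gt;
&lt;td&gt;ἐ-στήσ-α-μεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔ-στη-τε&lt;/td&gt;
&lt;td&gt;ἐ-στήσ-α-τε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔ-στη-σαν&lt;/td&gt;
&lt;td&gt;ἔ-στησ-α-ν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;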
&lt;p&gt;This is true of the compounds too.&lt;/p&gt;
&lt;p&gt;We&#39;ll later take up the topic of the different usage between the two sets. But for now I just want to highlight that, unlike most of the other examples in this post or the previous one, this is not an example of a shift between aorist classes happening before our eyes but something more ingrained in the earlier history of Greek. We&#39;ll definitely return to it, but there are other matters to cover first.&lt;/p&gt;
&lt;p&gt;We&#39;ve now allocated all our aorist active infinitive and indicative forms to inflectional (or at least ending) classes and in the next post we&#39;ll look at some counts.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part forty-four of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 43</title>
    <link href="https://jktauber.com/2020/06/29/a-tour-of-greek-morphology-part-43/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 43"/>
    <published>2020-06-29T21:00:00+08:00</published>
    <updated>2020-06-29T21:00:00+08:00</updated>
    <id>https://jktauber.com/2020/06/29/a-tour-of-greek-morphology-part-43</id>
    <content type="html" xml:base="https://jktauber.com/2020/06/29/a-tour-of-greek-morphology-part-43/">&lt;p&gt;Part forty-three of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;Before we get to counts in the various aorist classes, we need to dive a little more into the verbs that appear to be in more than one class.&lt;/p&gt;
&lt;p&gt;We&#39;ve already seen the kappa aorists like ἔδωκα and ἔθηκα that, in the infinitive (and, classically, in the plural), are root aorists but elsewhere have alphathematic endings and a slightly different stem.&lt;/p&gt;
&lt;p&gt;In this post we&#39;re going to look at the aorist active verbs in the SBLGNT that have a consistent stem throughout but exhibit both thematic (2nd aorist) and alphathematic (1st aorist) variants. In other words, for some cells in the paradigm there is a form that follows the thematic distinguisher pattern, for some cells there is a form that follows the alphathematic distinguisher pattern, and in some cells we find both forms. In theory, both forms might be possible in any cell, but we&#39;re just using a small corpus so in practice the paradigms will be sparse.&lt;/p&gt;
&lt;p&gt;In all cases, the thematic aorist is the older form and the alphathematic form developed later (particularly during the Hellenistic period) as part of a general movement towards having fewer classes of aorist.&lt;/p&gt;
&lt;p&gt;Note that the &lt;strong&gt;3SG&lt;/strong&gt; ending -ε(ν) is ambiguous as to which class the form is in (between these two classes).&lt;/p&gt;
&lt;p&gt;I should also note that the stem and its relationship to the imperfective stem &lt;em&gt;can&lt;/em&gt; be used as a diagnostic for aorist class. But we are ignoring that for now and just focusing on the classes of &lt;em&gt;ending&lt;/em&gt; (or more precisely, the distinguishers).&lt;/p&gt;
&lt;p&gt;The relevant verbs are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ἔρχομαι/ἦλθον and compounds&lt;/li&gt;
&lt;li&gt;λέγω/εἶπον and compounds&lt;/li&gt;
&lt;li&gt;φέρω/ἤνεγκα compounds&lt;/li&gt;
&lt;li&gt;πίπτω/ἔπεσα and compounds&lt;/li&gt;
&lt;li&gt;βάλλω/ἔβαλον and compounds&lt;/li&gt;
&lt;li&gt;εὑρίσκω/εὗρον&lt;/li&gt;
&lt;li&gt;ὁράω/εἶδον&lt;/li&gt;
&lt;li&gt;ἀναιρέω/ἀνεῖλον&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;ἔρχομαι/ἦλθον and compounds&lt;/h3&gt;
&lt;p&gt;The alphathematic variants seem more likely in the plural (although we&#39;ll defer any actual statistics for now).&lt;/p&gt;
&lt;p&gt;Note these could not be reanalyzed as sigmatic or pseudo-sigmatic.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐλθεῖν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἦλθον&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἦλθες&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἦλθε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἤλθομεν&lt;/td&gt;
&lt;td&gt;ἤλθαμεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἤλθατε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἦλθον&lt;/td&gt;
&lt;td&gt;ἦλθαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἀπελθεῖν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἀπῆλθον&lt;/td&gt;
&lt;td&gt;ἀπῆλθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἀπῆλθε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἀπῆλθον&lt;/td&gt;
&lt;td&gt;ἀπῆλθαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἰσελθεῖν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἰσῆλθον&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἰσῆλθες&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἰσῆλθε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἰσήλθομεν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;εἰσήλθατε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἰσῆλθον&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐξελθεῖν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐξῆλθον&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐξῆλθες&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐξῆλθε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐξήλθομεν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἐξήλθατε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐξῆλθον&lt;/td&gt;
&lt;td&gt;ἐξῆλθαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;προσῆλθε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;προσῆλθον&lt;/td&gt;
&lt;td&gt;προσῆλθαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;συνελθεῖν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;συνῆλθε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;συνῆλθον&lt;/td&gt;
&lt;td&gt;συνῆλθαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3&gt;λέγω/εἶπον and compounds&lt;/h3&gt;
&lt;p&gt;Note these could not be reanalyzed as sigmatic or pseudo-sigmatic.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἰπεῖν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἶπον&lt;/td&gt;
&lt;td&gt;εἶπα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἶπες&lt;/td&gt;
&lt;td&gt;εἶπας&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἶπε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;εἴπατε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἶπον&lt;/td&gt;
&lt;td&gt;εἶπαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;προεῖπον&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;προεῖπε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;προείπαμεν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3&gt;φέρω/ἤνεγκα compounds&lt;/h3&gt;
&lt;p&gt;Note the stem ends in a kappa and so it resembles a kappa aorist when alphathematic. It is therefore particularly interesting that the indicatives are &lt;em&gt;all&lt;/em&gt; alphathematic (or in the case of the 3SG, could be taken as in that class).&lt;/p&gt;
&lt;p&gt;In other words, the existence of the kappa may have made speakers feel a little more comfortable using the alpha endings.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἀνενεγκεῖν&lt;/td&gt;
&lt;td&gt;ἀνενέγκαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἀνήνεγκε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἀπενεγκεῖν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἀπήνεγκε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἀπήνεγκαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἰσενεγκεῖν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;εἰσηνέγκαμεν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ὑπενεγκεῖν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ὑπήνεγκα&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3&gt;πίπτω/ἔπεσα and compounds&lt;/h3&gt;
&lt;p&gt;Note the stem ends in a sigma and so it resembles a sigmatic aorist when alphathematic. As with ἤνεγκα, it is therefore interesting that the indicatives are all alphathematic (or in the case of the 3SG, could be taken as in that class).&lt;/p&gt;
&lt;p&gt;In other words, the existence of the sigma may have made speakers feel a little more comfortable using the alpha endings.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;πεσεῖν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἔπεσα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔπεσε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἔπεσαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἀναπεσεῖν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἀνέπεσε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἀνέπεσαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐκπεσεῖν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐξέπεσε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἐξεπέσατε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἐξέπεσαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐπέπεσε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἐπέπεσαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;προσέπεσε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;προσέπεσαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3&gt;βάλλω/ἔβαλον and compounds&lt;/h3&gt;
&lt;p&gt;Notice that, as has often been the case before, the &lt;strong&gt;3PL&lt;/strong&gt; appears in both classes. In a future post we&#39;ll run some numbers, as it could be that the &lt;strong&gt;3PL&lt;/strong&gt; is simply more common in general.&lt;/p&gt;
&lt;p&gt;The stem here ends in a resonant, so the alphathematics look a little more like pseudo-sigmatics.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;βαλεῖν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔβαλε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔβαλον&lt;/td&gt;
&lt;td&gt;ἔβαλαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐπιβαλεῖν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐπέβαλε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐπέβαλον&lt;/td&gt;
&lt;td&gt;ἐπέβαλαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3&gt;εὑρίσκω/εὗρον&lt;/h3&gt;
&lt;p&gt;The stem here ends in a resonant, so the alphathematics look a little more like pseudo-sigmatics.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εὑρεῖν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εὗρον&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εὗρες&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εὗρε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εὕρομεν&lt;/td&gt;
&lt;td&gt;εὕραμεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εὗρον&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3&gt;ὁράω/εἶδον&lt;/h3&gt;
&lt;p&gt;Note that, like λέγω/εἶπον, these could not be reanalyzed as sigmatic or pseudo-sigmatic.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἰδεῖν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἶδον&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἶδες&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἶδε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἴδομεν&lt;/td&gt;
&lt;td&gt;εἴδαμεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἴδετε&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἶδον&lt;/td&gt;
&lt;td&gt;εἶδαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3&gt;ἀναιρέω/ἀνεῖλον&lt;/h3&gt;
&lt;p&gt;The stem here ends in a resonant, so the alphathematics look a little more like pseudo-sigmatics.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;th&gt;alphathematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἀνελεῖν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἀνεῖλες&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἀνεῖλε(ν)&lt;/td&gt;
&lt;td&gt;←&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἀνείλατε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἀνεῖλαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;In the next post, we&#39;ll cover other aorist active verbs that have some variant forms. Then we&#39;ll be in a position to do some counts.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part forty-three of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">Lemmatization for the Morphological Lexicon</title>
    <link href="https://jktauber.com/2020/06/15/lemmatization-for-the-morphological-lexicon/" rel="alternate" type="text/html" title="Lemmatization for the Morphological Lexicon"/>
    <published>2020-06-15T08:42:00-04:00</published>
    <updated>2020-06-15T08:42:00-04:00</updated>
    <id>https://jktauber.com/2020/06/15/lemmatization-for-the-morphological-lexicon</id>
    <content type="html" xml:base="https://jktauber.com/2020/06/15/lemmatization-for-the-morphological-lexicon/">&lt;p&gt;As I slowly expand my plans for a &lt;em&gt;Morphological Lexicon of New Testament Greek&lt;/em&gt; to a &lt;em&gt;Morphological Lexicon of Ancient Greek&lt;/em&gt;, I&#39;m dealing with extra challenges in lemmatization.&lt;/p&gt;
&lt;p&gt;One of the things I&#39;m doing to verify my work is to take existing morphologically tagged and lemmatized Greek texts and see whether my code and (more importantly) data generate the same form given the lemma and morphosyntactic properties. In particular, I&#39;m currently doing this with noun forms in &lt;a href=&#34;https://github.com/vgorman1/Greek-Dependency-Trees&#34;&gt;Vanessa Gorman&#39;s Greek Dependency Trees&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Along the way, I&#39;m having to amend a number of Gorman&#39;s lemmas. Not because they are wrong per se, but because they are serving a different purpose than what I need. This is not a problem unique to the Gorman trees and I gave an entire &lt;a href=&#34;https://vimeo.com/243936959&#34;&gt;talk at SBL 2017&lt;/a&gt; about related issues.&lt;/p&gt;
&lt;p&gt;As I said there, a lemma provides a link between a token in a text and an entry in a lexical resource. It acts as a key by which to retrieve a record in a lexical database (traditionally a print dictionary).&lt;/p&gt;
&lt;p&gt;There are two problems with this, however:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;you may wish to link to multiple independent lexical resources and the entries in each may not have a one-to-one mapping&lt;/li&gt;
&lt;li&gt;data within the lexical entry may not be valid for all tokens linking to that lexical entry and if the lemma only identifies the lexical entry as a whole, there&#39;s no way to discern which specific properties apply in each case.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I give many examples in my SBL 2017 talk. One obvious example of a problem is words with multiple senses. Sometimes texts will be tagged with more sense information, λέγω3 rather than just λέγω for example. This typically assumes a single canonical lexical resource (like the LSJ for Greek).&lt;/p&gt;
&lt;p&gt;But the problem exists with other information attached to a lexical entry. Notably relevant in my case is morphological information like stems or inflectional classes.&lt;/p&gt;
&lt;p&gt;Now if your goal in lemmatizing a text is to link back to an entry in LSJ (or maybe a particular sense), that&#39;s fine, but it is not a precise enough reference to hang morphological information off. And this is why I&#39;m having to augment the lemmatization in annotated texts like Vanessa Gorman&#39;s (and later the Diorisis corpus).&lt;/p&gt;
&lt;p&gt;Much of this has to do with dialect variation. For this reason I didn&#39;t discuss many examples in my SBL 2017 talk (which was primarily focused on the New Testament). I did have an example of spelling variation, though, which is similar.&lt;/p&gt;
&lt;p&gt;The example I gave there was ἀνάπειρος versus ἀνάπηρος. And, as I said at the time, you may not care to distinguish these if you&#39;re doing lexical semantics but if you&#39;re a textual critic or phonologist, you might. And I made the following point which is particularly relevant here:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;why should ἀναπείρους be lemmatized ἀνάπηρος?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Now again, it seems innocuous if your goal in lemmatizing is just to link to a canonical dictionary. But if you&#39;re doing any kind of morphological modelling, then ἀνα-πείρ-ο- and ἀνά-πηρ-ο- are &lt;em&gt;different&lt;/em&gt; stems. Because they are different stems, there are different objects needed to hang the &#34;stem&#34; property off and you want the token &#34;ἀναπείρους&#34; to point to the object with stem = ἀνα-πείρ-ο- not the object with stem = ἀνά-πηρ-ο-.&lt;/p&gt;
&lt;p&gt;My SBL 2017 talk briefly listed a few other examples:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ἀναλόω ~ ἀναλίσκω&lt;/li&gt;
&lt;li&gt;ἀποκτείνω ~ ἀποκτέννω&lt;/li&gt;
&lt;li&gt;ἑλκύω ~ ἕλκω&lt;/li&gt;
&lt;li&gt;ἵστημι ~ ἱστάνω&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;as well as:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ἀλλάσσω ~ ἀλλάττω&lt;/li&gt;
&lt;li&gt;ἁρμόττω ~ ἁρμόζω&lt;/li&gt;
&lt;li&gt;κλαίω ~ κλάω&lt;/li&gt;
&lt;li&gt;αὐξάνω ~ αὔξω&lt;/li&gt;
&lt;li&gt;μείγνυμι ~ μίγνυμι&lt;/li&gt;
&lt;li&gt;οἶμαι ~ οἴομαι&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As I tried to emphasize in my talk, choosing to lump (or split) these isn&#39;t wrong per se. But for my specific morphological purposes, and to the extent that there is a difference in any morphological property, whether it be the stem or the distinguisher paradigm or the inflectional class, or whatever, there needs to be a separate object to point to.&lt;/p&gt;
&lt;p&gt;For this reason I&#39;m adding distinct lemmas for each dialect. So far that has meant changing the lemmas for about 5% of the forms in the Gorman trees (note that&#39;s unique forms, not tokens). And so μέλιτταν gets &#34;lemmatized&#34; by me as μέλιττα not as μέλισσα, μέγαθος gets lemmatized as μέγαθος not μέγεθος, μαντηίῳ gets lemmatized as μαντήιον not μαντεῖον, and so on.&lt;/p&gt;
&lt;p&gt;That is not to do away with the lumping all together. I can collect variations together into groups and link the &lt;em&gt;group&lt;/em&gt; to, say, the LSJ entry. This is then entirely appropriate to use for properties that are shared across dialect / spelling variations.&lt;/p&gt;
&lt;p&gt;This is a key point about the lattice approach described in my SBL 2017 talk. You have an object to point to where you need to be specific AND an object to point to where you can be general.&lt;/p&gt;
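&lt;p&gt;A minimal Python sketch of that two-level idea, using the examples above. The dictionary layout and the name &lt;code&gt;lookup&lt;/code&gt; are assumptions made for illustration only, not the actual data model of the lexicon:&lt;/p&gt;

```python
# Hypothetical sketch of narrow lemmas grouped under a broader entry.
# The names and layout are illustrative, not the real data model.

# Narrow lemma: the object that morphological properties (stem, class) hang off.
NARROW_LEMMA = {
    "μέλιτταν": "μέλιττα",
    "μέγαθος": "μέγαθος",
    "μαντηίῳ": "μαντήιον",
}

# Group: the object that shared properties (e.g. a dictionary entry) hang off.
GROUP = {
    "μέλιττα": "μέλισσα",
    "μέλισσα": "μέλισσα",
    "μέγαθος": "μέγεθος",
    "μέγεθος": "μέγεθος",
    "μαντήιον": "μαντεῖον",
    "μαντεῖον": "μαντεῖον",
}


def lookup(form):
    """Return (narrow lemma, group) for a form in the toy table."""
    narrow = NARROW_LEMMA[form]
    return narrow, GROUP[narrow]


print(lookup("μέλιτταν"))
```

&lt;p&gt;The specific object is there when a stem or inflectional class is needed, and the general one is there when linking out to something like an LSJ entry.&lt;/p&gt;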
&lt;p&gt;Furthermore, sometimes inflected forms can be ambiguous as to their &#34;narrow&#34; lemma. A ᾱ~η alternation between dialects in an ending, for example, will be neutralized in endings with a short ᾰ. And so even in the morphologically-focused case, there is sometimes a need for lumping across dialects.&lt;/p&gt;
&lt;p&gt;It&#39;s not just a matter of dialects and spelling variation: suppletion and heteroclisis also come into play here and benefit from this approach.&lt;/p&gt;
&lt;p&gt;This is all extra work but I think it&#39;s necessary for a more precise, corpus-based language description.&lt;/p&gt;
&lt;p&gt;I want to finish working through the Gorman nouns before I share some of this data but that should happen in the coming months. And I want to emphasize that, in most cases, I&#39;m not actually &lt;em&gt;changing&lt;/em&gt; the lemmatization, I&#39;m just adding to it (although I am finding the occasional error whose correction I will send upstream).&lt;/p&gt;
&lt;p&gt;It&#39;s still early days and one thing I haven&#39;t settled on is good terminology. I&#39;m inclined to go with the lexeme ~ flexeme distinction. But then do I call the key for the latter the &#34;flemma&#34;?&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">As I slowly expand my plans for a &lt;em&gt;Morphological Lexicon of New Testament Greek&lt;/em&gt; to a &lt;em&gt;Morphological Lexicon of Ancient Greek&lt;/em&gt;, I&#39;m dealing with extra challenges in lemmatization.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 42</title>
    <link href="https://jktauber.com/2020/05/10/a-tour-of-greek-morphology-part-42/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 42"/>
    <published>2020-05-10T09:00:00+08:00</published>
    <updated>2020-05-10T09:00:00+08:00</updated>
    <id>https://jktauber.com/2020/05/10/a-tour-of-greek-morphology-part-42</id>
    <content type="html" xml:base="https://jktauber.com/2020/05/10/a-tour-of-greek-morphology-part-42/">&lt;p&gt;Part forty-two of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;We now turn to the middle aorist endings.&lt;/p&gt;
&lt;p&gt;Recall that the imperfect middle endings were:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;-μην&lt;/li&gt;
&lt;li&gt;-σο (often with loss of sigma and subsequent contraction)&lt;/li&gt;
&lt;li&gt;-το&lt;/li&gt;
&lt;li&gt;-μεθα&lt;/li&gt;
&lt;li&gt;-σθε&lt;/li&gt;
&lt;li&gt;-ντο&lt;/li&gt;
&lt;/ul&gt;
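&lt;p&gt;Adding -σθαι for the infinitive, we unsurprisingly get the following distinguishers for the middles for alpha and thematic aorists:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;alpha&lt;/th&gt;
&lt;th&gt;thematic&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xασθαι&lt;/td&gt;
&lt;td&gt;Xέσθαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xάμην&lt;/td&gt;
&lt;td&gt;Xόμην&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xω&lt;/td&gt;
&lt;td&gt;Xου&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xατο&lt;/td&gt;
&lt;td&gt;Xετο&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xάμεθα&lt;/td&gt;
&lt;td&gt;Xόμεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xασθε&lt;/td&gt;
&lt;td&gt;Xεσθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xαντο&lt;/td&gt;
&lt;td&gt;Xοντο&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;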
&lt;p&gt;Notice that in the &lt;strong&gt;2SG&lt;/strong&gt;, ασο &amp;gt; αο &amp;gt; ω and εσο &amp;gt; εο &amp;gt; ου (although not all dialects do this).&lt;/p&gt;
&lt;p&gt;For reasons we may touch on later, the root aorists don&#39;t generally appear in the middle but δίδωμι, τίθημι, and ἵημι (with stems δο-, θε-, and ἑ- respectively) have aorist middle forms that essentially act like root aorists (just as the aorist active plurals do in Classical Greek):&lt;/p&gt;
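&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;δίδωμι&lt;/th&gt;
&lt;th&gt;τίθημι&lt;/th&gt;
&lt;th&gt;ἵημι&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;δόσθαι&lt;/td&gt;
&lt;td&gt;θέσθαι&lt;/td&gt;
&lt;td&gt;ἕσθαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐδόμην&lt;/td&gt;
&lt;td&gt;ἐθέμην&lt;/td&gt;
&lt;td&gt;εἵμην&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔδου&lt;/td&gt;
&lt;td&gt;ἔθου&lt;/td&gt;
&lt;td&gt;εἷσο&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔδοτο&lt;/td&gt;
&lt;td&gt;ἔθετο&lt;/td&gt;
&lt;td&gt;εἷτο&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐδόμεθα&lt;/td&gt;
&lt;td&gt;ἐθέμεθα&lt;/td&gt;
&lt;td&gt;εἵμεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔδοσθε&lt;/td&gt;
&lt;td&gt;ἔθεσθε&lt;/td&gt;
&lt;td&gt;εἷσθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔδοντο&lt;/td&gt;
&lt;td&gt;ἔθεντο&lt;/td&gt;
&lt;td&gt;εἷντο&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;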
&lt;p&gt;Again notice that in the &lt;strong&gt;2SG&lt;/strong&gt; we get a loss of sigma in the case of δίδωμι and τίθημι, giving οσο &amp;gt; οο &amp;gt; ου and εσο &amp;gt; εο &amp;gt; ου, although this time the contraction is with the root vowel, not an (alpha-)thematic vowel. Presumably εἷσο resists sigma loss and contraction because it&#39;s disyllabic.&lt;/p&gt;
&lt;p&gt;The ambiguities are straightforward to deal with:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;in the &lt;strong&gt;3SG&lt;/strong&gt;, &lt;strong&gt;2PL&lt;/strong&gt;, &lt;strong&gt;INF&lt;/strong&gt;, there is an ambiguity between the thematic and τίθημι&lt;/li&gt;
&lt;li&gt;in the &lt;strong&gt;1SG&lt;/strong&gt;, &lt;strong&gt;1PL&lt;/strong&gt;, &lt;strong&gt;3PL&lt;/strong&gt;, there is an ambiguity between the thematic and δίδωμι&lt;/li&gt;
&lt;li&gt;in the &lt;strong&gt;2SG&lt;/strong&gt;, there is an ambiguity between the thematic, τίθημι, and δίδωμι&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As we&#39;ve seen before, this all comes down to whether the ε or ο is from the root vowel or theme vowel.&lt;/p&gt;
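&lt;p&gt;The ambiguities listed above can be checked mechanically. The following is a small Python sketch, with the endings copied from the tables in this post as they appear after the stem-final consonant. The names &lt;code&gt;ENDINGS&lt;/code&gt; and &lt;code&gt;ambiguous_with_thematic&lt;/code&gt; are invented for this example:&lt;/p&gt;

```python
# Aorist middle endings as they appear after the final consonant of the stem.
# For τίθημι and δίδωμι the first vowel is the root vowel, not a theme vowel.
SLOTS = ["INF", "1SG", "2SG", "3SG", "1PL", "2PL", "3PL"]
ENDINGS = {
    "thematic": ["έσθαι", "όμην", "ου", "ετο", "όμεθα", "εσθε", "οντο"],
    "τίθημι": ["έσθαι", "έμην", "ου", "ετο", "έμεθα", "εσθε", "εντο"],
    "δίδωμι": ["όσθαι", "όμην", "ου", "οτο", "όμεθα", "οσθε", "οντο"],
}


def ambiguous_with_thematic(slot):
    """List the verbs whose ending in this slot matches the thematic one."""
    i = SLOTS.index(slot)
    target = ENDINGS["thematic"][i]
    others = [name for name in ENDINGS if name != "thematic"]
    return [name for name in others if ENDINGS[name][i] == target]


for slot in SLOTS:
    print(slot, ambiguous_with_thematic(slot))
```

&lt;p&gt;Each slot comes out ambiguous with τίθημι where the thematic ending begins with ε, with δίδωμι where it begins with ο, and with both in the contracted &lt;strong&gt;2SG&lt;/strong&gt;.&lt;/p&gt;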
&lt;p&gt;In the next couple of posts, we&#39;ll look at the frequency distributions of the aorist classes. We&#39;ll then start to explore in more detail the relationship between the perfective (aorist) and imperfective (present and imperfect) stems.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part forty-two of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">Tool Updates</title>
    <link href="https://jktauber.com/2020/03/19/tool-update/" rel="alternate" type="text/html" title="Tool Updates"/>
    <published>2020-03-19T15:42:00-04:00</published>
    <updated>2020-03-19T15:42:00-04:00</updated>
    <id>https://jktauber.com/2020/03/19/tool-update</id>
    <content type="html" xml:base="https://jktauber.com/2020/03/19/tool-update/">&lt;p&gt;I have made a minor update to &lt;code&gt;greek-normalisation&lt;/code&gt;, a more significant update to &lt;code&gt;vocabulary-tools&lt;/code&gt;, and have started a new project &lt;code&gt;postag-convert&lt;/code&gt; for converting between various morphosyntactic tagging schemes.&lt;/p&gt;
&lt;h2&gt;greek-normalisation&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/jtauber/greek-normalisation&#34;&gt;https://github.com/jtauber/greek-normalisation&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;utils&lt;/code&gt; module in this package previously had a function for converting U+02BC and U+1FBF to U+2019 but now (in the 0.4 release) additionally provides it as a shell command.&lt;/p&gt;
&lt;p&gt;Once the package is installed, you can type &lt;code&gt;to2019 in &amp;gt; out&lt;/code&gt; in the shell and the file named &lt;code&gt;in&lt;/code&gt; will be converted to a file named &lt;code&gt;out&lt;/code&gt; with all the U+02BC and U+1FBF characters changed to U+2019.&lt;/p&gt;
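&lt;p&gt;For readers who just want the gist of the conversion, here is a rough Python sketch. It is not the package&#39;s own code, and the function name &lt;code&gt;to_2019&lt;/code&gt; is made up for this example. It simply maps the two code points to U+2019 with a translation table:&lt;/p&gt;

```python
# Hypothetical sketch: map U+02BC (modifier letter apostrophe) and
# U+1FBF (Greek psili) to U+2019 (right single quotation mark).
# This is not the greek-normalisation implementation.
APOSTROPHE_MAP = {0x02BC: 0x2019, 0x1FBF: 0x2019}


def to_2019(text):
    """Return text with U+02BC and U+1FBF replaced by U+2019."""
    return text.translate(APOSTROPHE_MAP)


# An elided word written once with each of the two characters.
sample = "\u1f00\u03bb\u03bb\u02bc \u1f00\u03bb\u03bb\u1fbf"
print(to_2019(sample))
```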
&lt;h2&gt;vocabulary-tools&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/jtauber/vocabulary-tools&#34;&gt;https://github.com/jtauber/vocabulary-tools&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;I &lt;a href=&#34;https://jktauber.com/2020/02/14/vanessa-gormans-lemmatisation-now-in-vocabularytools/&#34;&gt;previously mentioned&lt;/a&gt; I&#39;d incorporated lemma counts from Vanessa Gorman&#39;s treebanks into &lt;code&gt;vocabulary-tools&lt;/code&gt;. I didn&#39;t check the Unicode normalisation first, though, and it turns out it was inconsistent (which led to bad numbers). That&#39;s now been fixed and the data converted to NFC.&lt;/p&gt;
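&lt;p&gt;To see why inconsistent normalisation leads to bad numbers, consider this small Python sketch (an illustration using only the standard library, not code from &lt;code&gt;vocabulary-tools&lt;/code&gt;). The same lemma encoded two different ways is counted as two different keys until everything is converted to NFC:&lt;/p&gt;

```python
import unicodedata
from collections import Counter

# The same lemma typed two ways: precomposed vs. base letter plus combining accent.
composed = "\u03bb\u03ad\u03b3\u03c9"          # epsilon with tonos as one code point
decomposed = "\u03bb\u03b5\u0301\u03b3\u03c9"  # epsilon followed by combining acute

raw_counts = Counter([composed, decomposed])
nfc_counts = Counter(
    unicodedata.normalize("NFC", s) for s in [composed, decomposed]
)

print(len(raw_counts))  # number of distinct keys before normalising
print(len(nfc_counts))  # number of distinct keys after normalising to NFC
```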
&lt;p&gt;I&#39;ve also added the Diorisis lemma counts and cleaned up the code to share more between the two data sets.&lt;/p&gt;
&lt;p&gt;Thirdly, I took a pass at finding the intersection between the passages covered by Gorman and Diorisis and generated separate lemma counts for each version of the intersection. I&#39;ll write a dedicated blog post about this later but basically I&#39;m trying to track down systemic problems with various lemmatisations and having identical texts to compare (to make sure discrepancies aren&#39;t just subcorpus bias) is very helpful.&lt;/p&gt;
&lt;p&gt;Fourthly, I&#39;ve implemented a calculation of log rank differences between lemmas in two subcorpora—in other words, a measure of how far the rank of a particular lemma differs in two subcorpora. This has (at least) two applications: one is to find which lemmas are disproportionally more common in one text versus another. For example, in the Gorman texts from Thucydides and Xenophon, the lemma Κῦρος is ranked 885th vs 30th—a log rank difference of 4.9. The lemma δράω is ranked 196th in Thucydides but 2345th in Xenophon—a log rank difference of 3.6.&lt;/p&gt;
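&lt;p&gt;The two figures just quoted are consistent with taking the absolute base-2 logarithm of the ratio of the two ranks. That reading is an assumption made here for illustration, and the helper &lt;code&gt;log2_rank_gap&lt;/code&gt; below is a hypothetical name rather than a function from &lt;code&gt;vocabulary-tools&lt;/code&gt;:&lt;/p&gt;

```python
import math


def log2_rank_gap(rank_a, rank_b):
    """Absolute base-2 log of the ratio of two ranks (assumed definition)."""
    return abs(math.log2(rank_a / rank_b))


# Ranks quoted in the text for Thucydides vs Xenophon.
print(round(log2_rank_gap(885, 30), 1))    # Κῦρος
print(round(log2_rank_gap(196, 2345), 1))  # δράω
```

&lt;p&gt;Working with logs of ranks means a move from 30th to 885th counts for far more than a move from 2000th to 2855th, which suits a comparison where the top of the frequency list matters most.&lt;/p&gt;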
&lt;p&gt;The second application is to compare two lemmatisations of the same subcorpus (e.g. Gorman vs Diorisis) to try to identify systemic problems. It turns out, for example, that the log rank difference of λέγω between Gorman and Diorisis is a whopping 5.974 (you&#39;d expect it to be at or near zero for the same corpus). Turns out that&#39;s because Gorman distinguishes λέγω, λέγω2, λέγω3 and Diorisis doesn&#39;t.&lt;/p&gt;
&lt;p&gt;More on this in a future post!&lt;/p&gt;
&lt;p&gt;You can see more examples of &lt;code&gt;log_rank_differences&lt;/code&gt; in action at:&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/jtauber/vocabulary-tools/blob/master/log_rank_differences.rst&#34;&gt;https://github.com/jtauber/vocabulary-tools/blob/master/log_rank_differences.rst&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;postag-convert&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/jtauber/postag-convert&#34;&gt;https://github.com/jtauber/postag-convert&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Over the years (actually decades), various projects of mine have had to convert between different schemes for morphosyntactic tagging. Whether the original CCAT scheme, my own variants of that scheme, the Robinson scheme, the Morpheus/Perseus scheme, or its Logeion variant, I&#39;ve written code at various points to do conversions from one to another. I was well overdue writing a reusable library!&lt;/p&gt;
&lt;p&gt;I also want to be able to support Leipzig Glossing Rules and the Universal Dependencies codes as well as fully spelling out properties and values in multiple natural languages (e.g. localising terms like &#34;case&#34; or &#34;accusative&#34;).&lt;/p&gt;
&lt;p&gt;&lt;code&gt;postag-convert&lt;/code&gt; is still in the early days but is intended to eventually be useful for all of the above (and potentially reusable beyond Ancient Greek too).&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I have made a minor update to &lt;code&gt;greek-normalisation&lt;/code&gt;, a more significant update to &lt;code&gt;vocabulary-tools&lt;/code&gt;, and have started a new project &lt;code&gt;postag-convert&lt;/code&gt; for converting between various morphosyntactic tagging schemes.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 41</title>
    <link href="https://jktauber.com/2020/03/17/a-tour-of-greek-morphology-part-41/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 41"/>
    <published>2020-03-17T12:00:00-05:00</published>
    <updated>2020-03-17T12:00:00-05:00</updated>
    <id>https://jktauber.com/2020/03/17/a-tour-of-greek-morphology-part-41</id>
    <content type="html" xml:base="https://jktauber.com/2020/03/17/a-tour-of-greek-morphology-part-41/">&lt;p&gt;Part forty-one of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;In &lt;a href=&#34;https://jktauber.com/2020/01/05/a-tour-of-greek-morphology-part-39/&#34;&gt;part 39&lt;/a&gt;, we outlined the distinguisher paradigms for the sigmatic (first), thematic (second), and root aorists in the active indicative and infinitive:&lt;/p&gt;
&lt;p&gt;| &lt;strong&gt;INF&lt;/strong&gt; | Xαι   | Xεῖν  | Xναι
| &lt;strong&gt;1SG&lt;/strong&gt; | Xα    | Xον   | Xν
| &lt;strong&gt;2SG&lt;/strong&gt; | Xας   | Xες   | Xς
| &lt;strong&gt;3SG&lt;/strong&gt; | Xε(ν) | Xε(ν) | X
| &lt;strong&gt;1PL&lt;/strong&gt; | Xαμεν | Xομεν | Xμεν
| &lt;strong&gt;2PL&lt;/strong&gt; | Xατε  | Xετε  | Xτε
| &lt;strong&gt;3PL&lt;/strong&gt; | Xαν   | Xον   | Xσαν&lt;/p&gt;
&lt;p&gt;For the sigmatic aorists, I didn&#39;t show the actual sigma because it was consistent across the paradigm (and hence not part of the &#34;distinguisher&#34;). This turned out to be a useful way to think about it for other reasons too.&lt;/p&gt;
&lt;p&gt;We&#39;ve already seen (in &lt;a href=&#34;https://jktauber.com/2020/02/05/a-tour-of-greek-morphology-part-40/&#34;&gt;part 40&lt;/a&gt;) that verbs like ἔδωκα and ἔθηκα follow the sigmatic paradigm in the singular (or in both the singular and plural in the Hellenistic period) despite not having a sigma at all.&lt;/p&gt;
&lt;p&gt;But there are other verbs that have the alpha endings too but without a sigma either because&lt;/p&gt;
&lt;p&gt;(a) the sigma sound is incorporated into the letter ξ or ψ:&lt;/p&gt;
&lt;p&gt;| &lt;strong&gt;INF&lt;/strong&gt; | δεῖξ-αι     | γράψ-αι       |
| &lt;strong&gt;1SG&lt;/strong&gt; | ἔ-δειξ-α    | ἔ-γραψ-α      |
| &lt;strong&gt;2SG&lt;/strong&gt; | ἔ-δειξ-ας   | &lt;em&gt;ἔ-γραψ-ας&lt;/em&gt;   |
| &lt;strong&gt;3SG&lt;/strong&gt; | ἔ-δειξ-ε(ν) | ἔ-γραψ-ε(ν)   |
| &lt;strong&gt;1PL&lt;/strong&gt; | ἐ-δείξ-αμεν | &lt;em&gt;ἐ-γράψ-αμεν&lt;/em&gt; |
| &lt;strong&gt;2PL&lt;/strong&gt; | ἐ-δείξ-ατε  | ἐ-γράψ-ατε    |
| &lt;strong&gt;3PL&lt;/strong&gt; | ἔ-δειξ-αν   | ἔ-γραψ-αν     |&lt;/p&gt;
&lt;p&gt;or&lt;/p&gt;
&lt;p&gt;(b) the sigma has dropped out because the previous sound is a resonant (nasal: μ, ν; or liquid: λ, ρ):&lt;/p&gt;
&lt;p&gt;| &lt;strong&gt;INF&lt;/strong&gt; | μεῖν-αι     | ἀπο-στεῖλ-αι     |
| &lt;strong&gt;1SG&lt;/strong&gt; | -ἐ-μειν-α   | ἀπ-έ-στειλ-α     |
| &lt;strong&gt;2SG&lt;/strong&gt; | &lt;em&gt;ἔ-μειν-ας&lt;/em&gt; | ἀπ-έ-στειλ-ας    |
| &lt;strong&gt;3SG&lt;/strong&gt; | ἔ-μειν-ε(ν) | ἀπ-έ-στειλ-ε(ν)  |
| &lt;strong&gt;1PL&lt;/strong&gt; | ἐ-μείν-αμεν | ἀπ-ε-στείλ-αμεν  |
| &lt;strong&gt;2PL&lt;/strong&gt; | -ἐ-μείν-ατε | &lt;em&gt;ἀπ-ε-στείλ-ατε&lt;/em&gt; |
| &lt;strong&gt;3PL&lt;/strong&gt; | ἔ-μειν-αν   | ἀπ-έ-στειλ-αν    |&lt;/p&gt;
&lt;p&gt;(forms not in SBLGNT in &lt;em&gt;italics&lt;/em&gt;)&lt;/p&gt;
&lt;p&gt;We&#39;ll discuss this in detail in another post but the loss of sigma in (b) is accompanied by a lengthening of the vowel before the resonant. Hence, for example, ἔμεινα compared with present μένω. These aorists are sometimes called &lt;em&gt;pseudo-sigmatic&lt;/em&gt; aorists.&lt;/p&gt;
&lt;p&gt;For the purposes of categorising distinguisher paradigms, (a) and (b) still just follow the alpha endings.&lt;/p&gt;
&lt;p&gt;And so there are three sets of endings:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;alpha endings (including the sigmatic, pseudo-sigmatic and kappa)&lt;/li&gt;
&lt;li&gt;thematic endings&lt;/li&gt;
&lt;li&gt;root endings&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As we shall see later, there are a few other verbs that sometimes take on alpha endings despite not even having an underlying sigma. There are also verbs that mix one type of aorist and another (sometimes with a semantic distinction).&lt;/p&gt;
&lt;p&gt;We&#39;ll come back to looking at the frequency distribution of the different types of aorist but, before we do that, let&#39;s take a look at the middle aorist endings.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part forty-one of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">Tolkien, Sonnenschein and Language Learning</title>
    <link href="https://jktauber.com/2020/02/24/tolkien-sonnenschein-and-language-learning/" rel="alternate" type="text/html" title="Tolkien, Sonnenschein and Language Learning"/>
    <published>2020-02-24T03:41:21-05:00</published>
    <updated>2020-02-24T03:41:21-05:00</updated>
    <id>https://jktauber.com/2020/02/24/tolkien-sonnenschein-and-language-learning</id>
    <content type="html" xml:base="https://jktauber.com/2020/02/24/tolkien-sonnenschein-and-language-learning/">&lt;p&gt;Via an unusual route, I discovered Edward Adolf Sonnenschein and his thoughts at the turn of the 20th century on teaching Latin (and Greek).&lt;/p&gt;
&lt;p&gt;It started last week when I was looking through Oronzo Cilli&#39;s wonderful book &lt;em&gt;Tolkien&#39;s Library: An Annotated Checklist&lt;/em&gt; for entries relating to Greek. One of the books mentioned was Sonnenschein&#39;s &lt;em&gt;A Greek Grammar for Schools: Based on the Principles and Requirements of the Grammatical Society&lt;/em&gt;, marked King Edward&#39;s School (where Tolkien went) and with Tolkien&#39;s brother Hilary&#39;s name.&lt;/p&gt;
&lt;p&gt;This was clear evidence that Hilary Tolkien, and possibly John Ronald himself, used Sonnenschein&#39;s grammar at King Edward&#39;s School. Sonnenschein was a classics professor in Birmingham, editor of a series of grammars (of which the Greek Grammar was one), and co-founder of the Classical Association.&lt;/p&gt;
&lt;p&gt;The grammars were published by Swan Sonnenschein, a firm founded by his brother, which, incidentally, merged with George Allen &amp;amp; Co just before it became George Allen &amp;amp; Unwin. Two decades later, of course, Unwin published &lt;em&gt;The Hobbit&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;When I talked to Seumas Macdonald about the Greek grammar (and Tolkien&#39;s classics education), he mentioned he was familiar with Sonnenschein from his Latin readers.&lt;/p&gt;
&lt;p&gt;Now quite independently of this, I was looking at &lt;em&gt;The Greek War Of Independence&lt;/em&gt;, an easy Greek reader by Charles D. Chambers. We&#39;re producing a digital edition of it as part of the &lt;a href=&#34;https://greek-learner-texts.org/&#34;&gt;Greek Learner Texts Project&lt;/a&gt;. Not only was the original book published by Swan Sonnenschein but the preface begins:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This book is an attempt to apply to Greek the methods
which Professor Sonnenschein has expounded in his &lt;em&gt;Ora
Maritima&lt;/em&gt; and &lt;em&gt;Pro Patria&lt;/em&gt;. The main principle is that
the systematic study of grammar should proceed side by
side with the reading of a narrative.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So here was another mention of Sonnenschein&#39;s Latin readers. I dug up an online scan of &lt;em&gt;Ora Maritima&lt;/em&gt; and discovered the following in the preface (written in 1908):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;My apology for adding another to the formidable array of elementary Latin manuals is that there is no book in existence which satisfies the requirements which I have in mind as of most importance for the fruitful study of the language by beginners. What I desiderate is:—
1. A continuous narrative from beginning to end, capable of appealing in respect of its vocabulary and subject matter to the minds and interests of young pupils, and free from all those syntactical and stylistic difficulties which make even the easiest of Latin authors something of a problem.
2. A work which shall hold the true balance between too much and too little in the matter of systematic grammar. In my opinion, existing manuals are disfigured by a disproportionate amount of lifeless Accidence. The outcome of the traditional system is that the pupil learns a multitude of Latin forms (Cases, Tenses, Moods), but very little Latin.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I love the phrase &#34;disfigured by a disproportionate amount of lifeless Accidence&#34;. It reminds me of the style of Tolkien&#39;s reviews in &lt;em&gt;The Year&#39;s Work in English Studies&lt;/em&gt; in 1924. I also love &#34;the pupil learns a multitude of Latin forms...but very little Latin&#34;.&lt;/p&gt;
&lt;p&gt;As Fletcher Hardison pointed out when I shared this quote with the team working on the easy Greek readers: &#34;I think we just found our manifesto&#34;.&lt;/p&gt;
&lt;p&gt;Later in the preface, Sonnenschein writes:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The pupil who has mastered this book ought to be able to read and write the easiest kind of Latin with some degree of fluency and without serious mistakes: in a word, Latin ought to have become in some degree a living language to him.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The use of &#34;no book in existence&#34; at the start makes me wonder whether this was the first real attempt at applying the Direct Method for historical languages. I wonder also if this is the first mention of Latin alongside the phrase &#34;living language&#34;.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ora Maritima&lt;/em&gt; also includes an earlier essay Sonnenschein wrote in 1900 entitled &lt;em&gt;Newer Methods in the Teaching of Latin&lt;/em&gt;. Presumably the reader is an attempt to implement the ideas in this essay.&lt;/p&gt;
&lt;p&gt;In it, Sonnenschein writes:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Grammar has its proper place in any systematised method of teaching a language; but that place is not at the beginning but rather at the end of each of the steps into which a well-graduated course must be divided.&lt;/p&gt;
&lt;p&gt;...&lt;/p&gt;
&lt;p&gt;There should be no preliminary study of grammar apart from the reading of a text.&lt;/p&gt;
&lt;p&gt;...&lt;/p&gt;
&lt;p&gt;Each new grammatical feature of the language would be presented as it is wanted, in an interesting context, and would be firmly grasped by the mind; at convenient points the knowledge acquired would be summed up in a table (the declension of a noun or the forms of a tense).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is almost identical to what I outlined back in my &#34;New Kind of Graded Reader&#34; video in 2008 (100 years after &lt;em&gt;Ora Maritima&lt;/em&gt;!): &#34;we are first introduced to forms as they are used in context and then come back later to consolidate, abstract, and generalize later.&#34;&lt;/p&gt;
&lt;p&gt;The use of &#34;would&#34; suggests that Sonnenschein does not (yet) consider the idea to have been implemented in any books (hence the goal of &lt;em&gt;Ora Maritima&lt;/em&gt; eight years later).&lt;/p&gt;
&lt;p&gt;The entire essay is worth reading. It&#39;s available at the &lt;a href=&#34;https://archive.org/details/cu31924031202850/page/n15/mode/2up&#34;&gt;Internet Archive&lt;/a&gt; although I might correct the OCR and make available a proper transcription.&lt;/p&gt;
&lt;p&gt;All of this has made me interested in the history of classical language teaching at the turn of the century. What was the relationship of Sonnenschein&#39;s work to that of W. H. D. Rouse? Did Sonnenschein know &lt;a href=&#34;https://jktauber.com/2016/01/13/gouin-language-learning/&#34;&gt;Gouin&#39;s book&lt;/a&gt;? Was Tolkien exposed to the Direct Method at all?&lt;/p&gt;
&lt;p&gt;Via Seumas, I became aware of &lt;em&gt;The Living Word: W. H. D. Rouse and the Crisis of Classics in Edwardian England&lt;/em&gt; by Christopher Stray. I promptly ordered a copy as well as Stray&#39;s &lt;em&gt;The Classical Association: The First Century 1903-2003&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;What started as a tenuous connection to Tolkien&#39;s classics education has returned me to a study of the pioneers of the Direct Method applied to classical languages and given me even more inspiration to work on the Greek Learner Texts Project.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE&lt;/strong&gt;: I&#39;ve finished a first pass correcting the OCR of Sonnenschein&#39;s 1900 essay &lt;em&gt;Newer Methods in the Teaching of Latin&lt;/em&gt; (as reproduced in &lt;em&gt;Ora Maritima&lt;/em&gt;) &lt;a href=&#34;https://gist.github.com/jtauber/dead795de4223aa2f5e6652494bbadb7&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Via an unusual route, I discovered Edward Adolf Sonnenschein and his thoughts at the turn of the 20th century on teaching Latin (and Greek).</summary>
  </entry><entry>
    <title type="html">Vanessa Gorman's Lemmatisation Now in vocabulary-tools</title>
    <link href="https://jktauber.com/2020/02/13/vanessa-gormans-lemmatisation-now-in-vocabularytools/" rel="alternate" type="text/html" title="Vanessa Gorman's Lemmatisation Now in vocabulary-tools"/>
    <published>2020-02-13T21:19:29-05:00</published>
    <updated>2020-02-13T21:19:29-05:00</updated>
    <id>https://jktauber.com/2020/02/13/vanessa-gormans-lemmatisation-now-in-vocabularytools</id>
    <content type="html" xml:base="https://jktauber.com/2020/02/13/vanessa-gormans-lemmatisation-now-in-vocabularytools/">&lt;p&gt;Last year I started the Python library &lt;a href=&#34;https://github.com/jtauber/vocabulary-tools&#34;&gt;vocabulary-tools&lt;/a&gt; to consolidate the various scripts I&#39;ve written over the years to analyse vocabulary in (particularly New Testament) texts. I&#39;ve just added support for the vocabulary in Vanessa Gorman&#39;s treebanks.&lt;/p&gt;
&lt;p&gt;As part of the &lt;a href=&#34;https://jktauber.com/tag/project:greek-texts/&#34;&gt;greek-texts project&lt;/a&gt; it&#39;s important to have as many texts lemmatised as possible. For a while (for example in &lt;a href=&#34;https://vocab.perseus.org&#34;&gt;https://vocab.perseus.org&lt;/a&gt;) I used Giuseppe Celano&#39;s &lt;a href=&#34;https://github.com/gcelano/LemmatizedAncientGreekXML&#34;&gt;automated lemmatisation of Perseus&lt;/a&gt;. Recently I started &lt;a href=&#34;https://jktauber.com/2020/01/20/working-with-the-diorisis-ancient-greek-corpus/&#34;&gt;working on cleaning up the Diorisis Ancient Greek Corpus&lt;/a&gt; which is also an automated lemmatisation.&lt;/p&gt;
&lt;p&gt;Automated lemmatisation is around 90% accurate but that&#39;s quite low for doing vocabulary work, especially as the lemmatisation errors are often systematic and so can throw off stats in quite a significant way.&lt;/p&gt;
&lt;p&gt;Ultimately, we need hand-curated lemmatisation and one of the goals of the &lt;a href=&#34;https://jktauber.com/tag/project:greek-texts/&#34;&gt;greek-texts project&lt;/a&gt; is to help facilitate that. I obviously already have a lemmatisation of the Greek New Testament. There is also the Ancient Greek Dependency Treebank. But one of the most impressive efforts in this area (especially in light of the fact it is a solo effort) is Vanessa Gorman&#39;s &lt;a href=&#34;https://perseids-publications.github.io/gorman-trees/&#34;&gt;work&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;There are now over 500,000 tokens of Greek prose treebanked by Professor Gorman. So not only lemmas but morphosyntactic tagging and syntactic dependency tagging as well.&lt;/p&gt;
&lt;p&gt;But for the short term, it&#39;s the lemmas I&#39;m interested in. I extracted them from the XML format produced by Arethusa and built lemma counts that could be loaded into &lt;a href=&#34;https://github.com/jtauber/vocabulary-tools&#34;&gt;vocabulary-tools&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;There&#39;s still a lot of work to do on my library but I can now do things like generate an 80% vocabulary list for the Gorman corpus. Or see what words you&#39;d have to learn to read Plato&#39;s Apology if you can read Lysias&#39;s On the Murder of Eratosthenes and the New Testament at the 95% level.&lt;/p&gt;
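&lt;p&gt;Here is a rough sketch of the idea behind those two queries. It is not the &lt;code&gt;vocabulary-tools&lt;/code&gt; API; the function name and the toy counts are invented purely for illustration:&lt;/p&gt;

```python
from collections import Counter


def coverage_list(counts, percent):
    """Most frequent lemmas needed to cover `percent` of the tokens."""
    total = sum(counts.values())
    chosen, covered = [], 0
    for lemma, count in counts.most_common():
        # Integer arithmetic avoids floating-point edge cases at the threshold.
        if covered * 100 >= percent * total:
            break
        chosen.append(lemma)
        covered += count
    return chosen


# Toy lemma counts (made up), ten tokens in total.
target = Counter({"καί": 5, "ὁ": 3, "λέγω": 1, "δράω": 1})
known = {"καί"}  # lemmas the reader already knows from other texts

needed = coverage_list(target, 80)
to_learn = [lemma for lemma in needed if lemma not in known]
print(needed)    # the 80% vocabulary list for the toy text
print(to_learn)  # what is still left to learn
```

&lt;p&gt;In practice the &#34;known&#34; set would itself be a coverage list computed from the texts you can already read.&lt;/p&gt;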
&lt;p&gt;I also took the opportunity to add more features to &lt;a href=&#34;https://github.com/jtauber/vocabulary-tools&#34;&gt;vocabulary-tools&lt;/a&gt; including incorporating the code I wrote for the &lt;a href=&#34;https://jktauber.com/2019/11/05/subcorpus-vocabulary-statistics/&#34;&gt;Subcorpus Vocabulary Statistics&lt;/a&gt; post (which was based on the Celano lemmatisation).&lt;/p&gt;
&lt;p&gt;Ultimately I&#39;d like more post-beginner Greek prose and the LXX lemmatised. I&#39;m currently working on Plato&#39;s Crito and Epictetus&#39;s Enchiridion. If you&#39;re interested in any of this stuff, please check out the &lt;a href=&#34;https://jktauber.com/tag/project:greek-texts/&#34;&gt;greek-texts project&lt;/a&gt; and join our Slack workspace.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Last year I started the Python library &lt;a href=&#34;https://github.com/jtauber/vocabulary-tools&#34;&gt;vocabulary-tools&lt;/a&gt; to consolidate the various scripts I&#39;ve written over the years to analyse vocabulary in (particularly New Testament) texts. I&#39;ve just added support for the vocabulary in Vanessa Gorman&#39;s treebanks.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 40</title>
    <link href="https://jktauber.com/2020/02/04/a-tour-of-greek-morphology-part-40/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 40"/>
    <published>2020-02-04T22:35:16-05:00</published>
    <updated>2020-02-04T22:35:16-05:00</updated>
    <id>https://jktauber.com/2020/02/04/a-tour-of-greek-morphology-part-40</id>
    <content type="html" xml:base="https://jktauber.com/2020/02/04/a-tour-of-greek-morphology-part-40/">&lt;p&gt;Part forty of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;In the classical Attic dialect we find the following aorist active forms for δίδωμι and τίθημι:&lt;/p&gt;
&lt;p&gt;| &lt;strong&gt;INF&lt;/strong&gt; | δοῦναι   | θεῖναι   |
| &lt;strong&gt;1SG&lt;/strong&gt; | ἔδωκα    | ἔθηκα    |
| &lt;strong&gt;2SG&lt;/strong&gt; | ἔδωκας   | ἔθηκας   |
| &lt;strong&gt;3SG&lt;/strong&gt; | ἔδωκε(ν) | ἔθηκε(ν) |
| &lt;strong&gt;1PL&lt;/strong&gt; | ἔδομεν   | ἔθεμεν   |
| &lt;strong&gt;2PL&lt;/strong&gt; | ἔδοτε    | ἔθετε    |
| &lt;strong&gt;3PL&lt;/strong&gt; | ἔδοσαν   | ἔθεσαν   |&lt;/p&gt;
&lt;p&gt;Looking at this vertically, we might split the constant part from the distinguisher as follows:&lt;/p&gt;
&lt;p&gt;| &lt;strong&gt;INF&lt;/strong&gt; | δ-οῦναι    | θ-εῖναι   |
| &lt;strong&gt;1SG&lt;/strong&gt; | ἔ-δ-ωκα    | ἔ-θ-ηκα    |
| &lt;strong&gt;2SG&lt;/strong&gt; | ἔ-δ-ωκας   | ἔ-θ-ηκας   |
| &lt;strong&gt;3SG&lt;/strong&gt; | ἔ-δ-ωκε(ν) | ἔ-θ-ηκε(ν) |
| &lt;strong&gt;1PL&lt;/strong&gt; | ἔ-δ-ομεν   | ἔ-θ-εμεν   |
| &lt;strong&gt;2PL&lt;/strong&gt; | ἔ-δ-οτε    | ἔ-θ-ετε    |
| &lt;strong&gt;3PL&lt;/strong&gt; | ἔ-δ-οσαν   | ἔ-θ-εσαν   |&lt;/p&gt;
&lt;p&gt;but looking horizontally, we might split it this way:&lt;/p&gt;
&lt;p&gt;| &lt;strong&gt;INF&lt;/strong&gt; | δοῦ-ναι    | θεῖ-ναι    |
| &lt;strong&gt;1SG&lt;/strong&gt; | ἔ-δω-κα    | ἔ-θη-κα    |
| &lt;strong&gt;2SG&lt;/strong&gt; | ἔ-δω-κας   | ἔ-θη-κας   |
| &lt;strong&gt;3SG&lt;/strong&gt; | ἔ-δω-κε(ν) | ἔ-θη-κε(ν) |
| &lt;strong&gt;1PL&lt;/strong&gt; | ἔ-δο-μεν   | ἔ-θε-μεν   |
| &lt;strong&gt;2PL&lt;/strong&gt; | ἔ-δο-τε    | ἔ-θε-τε    |
| &lt;strong&gt;3PL&lt;/strong&gt; | ἔ-δο-σαν   | ἔ-θε-σαν   |&lt;/p&gt;
&lt;p&gt;It looks like the singular forms share δω or θη and the plural forms share δο or θε.&lt;/p&gt;
&lt;p&gt;The plurals seem to be inflecting like root aorists with δο and θε as the root. The infinitives are consistent with this too (with an -εναι ending).&lt;/p&gt;
&lt;p&gt;However, the singular forms with the lengthened grade vowel seem to inflect with the alpha like we saw with the sigmatic aorists, except we have a kappa, not a sigma.&lt;/p&gt;
&lt;p&gt;And so we have:&lt;/p&gt;
&lt;p&gt;| &lt;strong&gt;singular&lt;/strong&gt;   | ἐ | lengthened grade (δω/θη) | κ | alphathematic endings -α -ας -ε |
| &lt;strong&gt;plural&lt;/strong&gt;     | ἐ | full grade (δο/θε)       |   | root endings -μεν -τε -σαν      |
| &lt;strong&gt;infinitive&lt;/strong&gt; |   | full grade (δο/θε)       |   | root ending -εναι               |&lt;/p&gt;
&lt;p&gt;These will sometimes be referred to as &lt;strong&gt;kappa aorists&lt;/strong&gt; even though the kappa is only found in the singular and the forms are otherwise consistent with root aorists.&lt;/p&gt;
&lt;p&gt;In the SBLGNT and other Hellenistic period Greek, we find these forms, though:&lt;/p&gt;
&lt;p&gt;| &lt;strong&gt;INF&lt;/strong&gt; | δοῦναι   | θεῖναι
| &lt;strong&gt;1SG&lt;/strong&gt; | ἔδωκα    | ἔθηκα
| &lt;strong&gt;2SG&lt;/strong&gt; | ἔδωκας   | ἔθηκας
| &lt;strong&gt;3SG&lt;/strong&gt; | ἔδωκε(ν) | ἔθηκε(ν)
| &lt;strong&gt;1PL&lt;/strong&gt; | ἐδώκαμεν | ἐθήκαμεν
| &lt;strong&gt;2PL&lt;/strong&gt; | ἐδώκατε  | ἐθήκατε
| &lt;strong&gt;3PL&lt;/strong&gt; | ἔδωκαν   | ἔθηκαν&lt;/p&gt;
&lt;p&gt;although in Luke 1.2 we find the older ἔδοσαν not ἔδωκαν.&lt;/p&gt;
&lt;p&gt;The infinitives and singulars have stayed the same, but the plurals have changed to be consistent with the singulars. They have the lengthened vowel grade, the kappa, and the alphathematic endings.&lt;/p&gt;
&lt;p&gt;| singular   | ἐ | lengthened grade (δω/θη) | κ | alphathematic endings -α -ας -ε      |
| plural     | ἐ | lengthened grade (δω/θη) | κ | alphathematic endings -αμεν -ατε -αν |
| infinitive |   | full grade (δο/θε)       |   | root ending -εναι                    |&lt;/p&gt;
&lt;p&gt;This is an example of &lt;strong&gt;paradigm levelling&lt;/strong&gt; within the active indicatives. The contrast in number between singular and plural was being indicated not only by the personal ending but (redundantly) by the vowel grade, the existence/absence of the kappa and the existence/absence of the alpha theme vowel.&lt;/p&gt;
&lt;p&gt;Redundancy is not a bad thing—it improves comprehensibility in the face of noise—but it is still easy to see how this sort of levelling might happen. A speaker might internalise from other verbs that if the aorist active &lt;strong&gt;2SG&lt;/strong&gt; is Χς, the &lt;strong&gt;2PL&lt;/strong&gt; is Χτε. This pattern works for root aorists, it works for thematic aorists, and it works for sigmatic aorists. Following that, a speaker familiar with ἔδωκα-ς might naturally produce ἐδώκα-τε. It would be obvious to listeners what was meant, even if the form ἔδοτε was the &#34;correct&#34; one. Over time, ἐδώκατε might be accepted and eventually dominate. A similar thing presumably happened with all the plural forms.&lt;/p&gt;
&lt;p&gt;We&#39;re almost done with an initial tour of the indicative aorist actives with just a couple more paradigms to look at before we switch to the middles.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part forty of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">Working with the Diorisis Ancient Greek Corpus</title>
    <link href="https://jktauber.com/2020/01/20/working-with-the-diorisis-ancient-greek-corpus/" rel="alternate" type="text/html" title="Working with the Diorisis Ancient Greek Corpus"/>
    <published>2020-01-20T09:03:54-05:00</published>
    <updated>2020-01-20T09:03:54-05:00</updated>
    <id>https://jktauber.com/2020/01/20/working-with-the-diorisis-ancient-greek-corpus</id>
    <content type="html" xml:base="https://jktauber.com/2020/01/20/working-with-the-diorisis-ancient-greek-corpus/">&lt;p&gt;I&#39;ve recently started working on cleaning up the Diorisis Ancient Greek Corpus for my own vocabulary and morphology work as well as potential use in Scaife.&lt;/p&gt;
&lt;p&gt;I can&#39;t remember if I simply didn&#39;t know about the Diorisis corpus until recently or simply had put it on a list of things to look at one day and forgot to get back to it. But it was Rodda et al.&#39;s &lt;em&gt;Vector space models of Ancient Greek word meaning, and a case study on Homer&lt;/em&gt; in &lt;a href=&#34;https://www.atala.org/content/tal-et-humanités-numériques&#34;&gt;TAL 60.3&lt;/a&gt; that put me (back) on to it.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://brill.com/view/journals/rdj/3/1/article-p55_55.xml?language=en&#34;&gt;The Diorisis Ancient Greek Corpus&lt;/a&gt; was produced by Alessandro Vatri and Barbara McGillivray and is a 10-million word corpus of 820 texts from Perseus and some other sources (in TEI XML, non-TEI XML, HTML, and apparently Microsoft Word).&lt;/p&gt;
&lt;p&gt;Vatri and McGillivray compiled it for studying semantic change but obviously it&#39;s useful for a number of my own research interests. I&#39;ve been working with Giuseppe Celano&#39;s lemmatisation (also used in Scaife) but it has some problems and I&#39;d always planned to clean it up a bit. Diorisis excited me as a potentially better curated (if smaller) corpus.&lt;/p&gt;
&lt;p&gt;I also liked the fact that the Diorisis corpus had work-level metadata with dates and genre (which I&#39;ve always wanted on my corpus for a variety of reasons). And of course, it is cc-by-sa licensed.&lt;/p&gt;
&lt;p&gt;So I started a repo to begin processing:&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/jtauber/diorisis&#34;&gt;https://github.com/jtauber/diorisis&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The first thing I did was write a script to extract the work-level metadata, with this result:&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/jtauber/diorisis/blob/master/catalog.tsv&#34;&gt;https://github.com/jtauber/diorisis/blob/master/catalog.tsv&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Then I started extracting the token-level data. Like the Celano data, the XML format used for the analysed corpus is huge (over 2GB on disk) and I wanted something a little more normalised along the lines of other work I&#39;m doing.&lt;/p&gt;
&lt;p&gt;But here&#39;s where I hit my first problem. Vatri and McGillivray made the odd decision to use BetaCode for the corpus word forms (although not the lemmas). What is even more odd is that their paper (linked above) argues for the benefits of BetaCode over Unicode. The arguments, however, seem to stem from a misunderstanding of Unicode.&lt;/p&gt;
&lt;p&gt;Section 3.3 of their paper begins:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;All Greek characters have been converted to Beta Code, in order to adopt a uniform and consistent encoding and with a view to automatic parsing and lemmatization. For these purposes, Beta Code was chosen because of its flexibility and ease of use in the following look-up operations:&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;They then proceed with three arguments. The first is:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Word-forms to be automatically analysed and annotated may or may not start with a capital letter; in order to be matched to entries in a digital dictionary, forms should be converted to the formats corresponding to the entries. Greek lowercase and uppercase letters are encoded as different characters in the Unicode table (e.g. the lower-case letter α corresponds to utf-8 code 0391, the upper-case letter A corresponds to utf-8 code 03B1), which would require an ad-hoc conversion for each character between its lower-case and upper-case versions. Beta Code simply encodes capitalization through the juxtaposition of an asterisk (*) character (lower-case α is encoded as A, and upper-case A is encoded as *A), which can be easily added or removed in the look-up process.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Firstly, this confuses Unicode code points with UTF-8 encoding forms. Secondly, they get the Unicode code points for α and Α around the wrong way. α is U+03B1 (which in UTF-8 would be &lt;code&gt;CE B1&lt;/code&gt;). Thirdly, &#34;ad-hoc conversion for each character between its lower-case and upper-case versions&#34; is an overstatement of the problem. A simple &lt;code&gt;.lower()&lt;/code&gt; in Python, for example, is arguably easier than removing &lt;code&gt;*&lt;/code&gt; (and certainly not &#34;ad-hoc&#34;) especially when one considers the wide range of accent and breathing placements one finds in uppercase BetaCode characters in the wild.&lt;/p&gt;
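&lt;p&gt;To make the distinction concrete, here is a small Python sketch (the helper name &lt;code&gt;describe&lt;/code&gt; is just for illustration) showing the code point and the UTF-8 bytes side by side, and that lowercasing works directly even on a capital with a breathing mark:&lt;/p&gt;

```python
def describe(ch):
    """Return the Unicode code point and the UTF-8 bytes of one character."""
    code_point = f"U+{ord(ch):04X}"
    utf8_bytes = " ".join(f"{b:02X}" for b in ch.encode("utf-8"))
    return code_point, utf8_bytes


print(describe("\u03b1"))  # lowercase alpha: ('U+03B1', 'CE B1')
print(describe("\u0391"))  # uppercase alpha: ('U+0391', 'CE 91')

# Lowercasing needs no hand-written per-character table:
# U+1F08 (capital alpha with psili) lowercases to U+1F00.
athens = "\u1f08\u03b8\u1fc6\u03bd\u03b1\u03b9"
print(athens.lower() == "\u1f00\u03b8\u1fc6\u03bd\u03b1\u03b9")  # True
```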
&lt;p&gt;Their second argument is:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Diacritics such as the Greek diaeresis ( ̈) may or may not appear in dictionary entries (for instance, editors may add them to Greek words to mark hiatuses in metrical texts). Greek characters containing the diaeresis (alone or in combination with other diacritic marks) all have different utf-8 codes (e.g. ϊ = 03ca, ΐ = 0390, ῒ = 1fd2, ῗ = 1fd7), whereas Beta Code encodes the diaeresis through the juxtaposition of a plus sign (+; e.g. ϊ = I+, ΐ = I/+, ῒ = I\+, ῗ = I=+). This makes it very easy to process diacritics in the look-up process.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Again there is a confusion between Unicode code points and UTF-8. &#34;03ca&#34; is a code point, not UTF-8. But more significantly, the argument that BetaCode is superior because it encodes the diaeresis as a separate character completely ignores decomposed characters in Unicode. Ironically, copy-pasting from their paper, ϊ is already decomposed as two code points:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;U+03B9 GREEK SMALL LETTER IOTA&lt;/li&gt;
&lt;li&gt;U+0308 COMBINING DIAERESIS&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In other words, Unicode provides exactly what they are asking for.&lt;/p&gt;
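&lt;p&gt;As a rough illustration using Python&#39;s standard &lt;code&gt;unicodedata&lt;/code&gt; module, normalising to NFD exposes the diaeresis as its own code point, at which point dropping it is a one-line replace. The function names here are my own for the sake of the example.&lt;/p&gt;

```python
import unicodedata


def code_points(text):
    """List (code point, character name) pairs for each character in text."""
    return [("U+%04X" % ord(ch), unicodedata.name(ch)) for ch in text]


def strip_diaeresis(text):
    """Remove U+0308 COMBINING DIAERESIS, keeping any other diacritics."""
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFC", decomposed.replace("\u0308", ""))


# U+03CA is the precomposed iota with diaeresis; NFD splits it in two.
print(code_points(unicodedata.normalize("NFD", "\u03ca")))

# U+1FD7 (iota with diaeresis and circumflex) keeps its circumflex.
print(code_points(strip_diaeresis("\u1fd7")))
```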
&lt;p&gt;Finally, they argue:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In ag orthography, the grave accent (`) is only used to mark the alteration of the pitch normally marked by an acute accent in connected speech; thus, it never appears in dictionary entries (which only contain acute or circumflex accents). Whereas Unicode has different codes for Greek characters with an acute or a grave accent, Beta Code encodes such diacritics as forward (/) and backward (\) slashes, respectively; this makes grave accents easy to convert into acute accents in the look-up process.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Again this ignores Unicode normalisation and decomposed characters. It is really no harder to convert graves to acutes with Unicode than with BetaCode.&lt;/p&gt;
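&lt;p&gt;A minimal sketch of that conversion, again assuming nothing beyond Python&#39;s standard &lt;code&gt;unicodedata&lt;/code&gt; module (the function name is illustrative): decompose, swap the combining grave for a combining acute, and recompose.&lt;/p&gt;

```python
import unicodedata

COMBINING_GRAVE = "\u0300"
COMBINING_ACUTE = "\u0301"


def grave_to_acute(text):
    """Turn every grave accent into an acute, leaving other marks alone."""
    decomposed = unicodedata.normalize("NFD", text)
    swapped = decomposed.replace(COMBINING_GRAVE, COMBINING_ACUTE)
    return unicodedata.normalize("NFC", swapped)


# Tau, omicron with grave (U+1F78), nu.
print(grave_to_acute("\u03c4\u1f78\u03bd"))

# Breathings survive: alpha with rough breathing and grave (U+1F03)
# becomes alpha with rough breathing and acute (U+1F05).
print(grave_to_acute("\u1f03"))
```

&lt;p&gt;Note that NFC yields the tonos code point (U+03CC for omicron) rather than the oxia one (U+1F79), so any dictionary being matched against should be normalised the same way.&lt;/p&gt;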
&lt;p&gt;These misunderstandings and misrepresentations of Unicode would be one thing if my argument was just that Unicode is no worse than BetaCode. But the choice of BetaCode is problematic for other reasons.&lt;/p&gt;
&lt;p&gt;Most of these problems have to do with &lt;code&gt;(&lt;/code&gt; and &lt;em&gt;especially&lt;/em&gt; &lt;code&gt;)&lt;/code&gt;. BetaCode texts &lt;em&gt;should&lt;/em&gt; use &lt;code&gt;&#39;&lt;/code&gt; for apostrophes marking elision. The Diorisis corpus sometimes does. But it also (in around six thousand cases by my initial estimate) uses &lt;code&gt;)&lt;/code&gt; instead. And so we have &lt;code&gt;KAT)&lt;/code&gt;  for κατ’ instead of &lt;code&gt;KAT&#39;&lt;/code&gt;. Diorisis is hardly the only culprit here. In Perseus I still find cases where &lt;code&gt;)&lt;/code&gt; was used in BetaCode so the (incorrect) Unicode comes out κατ̓. To make matters worse, &lt;code&gt;(&lt;/code&gt; and &lt;code&gt;)&lt;/code&gt; are also used for actual parentheses.&lt;/p&gt;
&lt;p&gt;And so in BetaCode in the wild, &lt;code&gt;)&lt;/code&gt; could mean smooth breathing or an apostrophe or a parenthesis. With BetaCode this has to be manually disambiguated. With Unicode it does not (unless incorrectly converted from ambiguous BetaCode).&lt;/p&gt;
&lt;p&gt;And so my process of converting Diorisis to using Unicode is not a trivial one. My initial conversion code flagged almost ten thousand tokens that need to be manually checked. The majority of these seem to be &lt;code&gt;)&lt;/code&gt; for elision but some are for parentheses. Eyeballing what I have so far, there are also cases of multiple words not properly tokenised into two and also some bad text (OCR or keying errors) that needs to be corrected.&lt;/p&gt;
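&lt;p&gt;To give a flavour of the kind of flagging involved, here is a hypothetical heuristic, not my actual conversion code. It assumes uppercase BetaCode tokens like the examples above and labels each &lt;code&gt;)&lt;/code&gt; as either a likely smooth breathing or something a human needs to review. The function name and the labels are made up for this sketch.&lt;/p&gt;

```python
VOWELS = set("AEHIOUW")
ACCENTS = set("/\\=")


def classify_close_parens(token):
    """Label each ')' in an uppercase BetaCode token.

    'breathing' means the context makes smooth breathing the likely reading;
    'review' means it could be elision, a parenthesis or a breathing.
    """
    labels = []
    for i, ch in enumerate(token):
        if ch != ")":
            continue
        prev = token[i - 1] if i != 0 else ""
        nxt = token[i + 1] if i + 1 != len(token) else ""
        if prev == "*" or nxt in ACCENTS:
            # After the capital marker or before an accent: a breathing.
            labels.append("breathing")
        elif prev in VOWELS and nxt != "":
            # After a vowel with more of the word still to come.
            labels.append("breathing")
        else:
            # Token-final, or after a consonant: needs a manual check.
            labels.append("review")
    return labels


for token in ["A)/NQRWPOS", "*)AQH=NAI", "E)N", "KAT)", "LO/GOU)"]:
    print(token, classify_close_parens(token))
```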
&lt;p&gt;Note that some of these issues were likely problems in the upstream text and so my task ahead is partly just doing that correction. But most of the work is addressing ambiguities in the BetaCode that would not exist if Unicode had been used everywhere. This makes the fact BetaCode was chosen for unnecessary reasons even more frustrating.&lt;/p&gt;
&lt;p&gt;One could argue I&#39;m talking at most about 0.1% of the text so I could just ignore problematic tokens. Given the automated lemmatisation is considerably less accurate than 99.9% (more like 90% at best) it might seem like a pedantic thing to focus on. But the problematic tokens tend to be of a particular type and aren&#39;t uniformly distributed in the corpus and so depending on the task the corpus is being used for, they can become more prominent than one might think from a figure like &#34;0.1%&#34;.&lt;/p&gt;
&lt;p&gt;My goal in the coming weeks is to have a slightly cleaned up Diorisis corpus completely in Unicode. This can then be used for some initial vocabulary stats work. My next goal after that is to improve the lemmatisation, initially using curated lemmatisations that did not exist when Diorisis was done. Longer term, I plan to continue to curate the lemmatisation.&lt;/p&gt;
&lt;p&gt;This improved Diorisis can then form the basis for a lot of the work I previously used the Celano analysis for. There will definitely be blog posts reporting status along the way.&lt;/p&gt;
&lt;p&gt;I am extremely grateful for the work that went into producing the Diorisis corpus. It is just a shame that a misunderstanding of Unicode led to a decision that is creating extra work now. But that will be overcome soon and hopefully my incremental improvements will turn out to be useful to others over time.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I&#39;ve recently started working on cleaning up the Diorisis Ancient Greek Corpus for my own vocabulary and morphology work as well as potential use in Scaife.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 39</title>
    <link href="https://jktauber.com/2020/01/05/a-tour-of-greek-morphology-part-39/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 39"/>
    <published>2020-01-05T16:00:00+08:00</published>
    <updated>2020-01-05T16:00:00+08:00</updated>
    <id>https://jktauber.com/2020/01/05/a-tour-of-greek-morphology-part-39</id>
    <content type="html" xml:base="https://jktauber.com/2020/01/05/a-tour-of-greek-morphology-part-39/">&lt;p&gt;Part thirty-nine of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;Now we&#39;ll take an initial look at the aorist active infinitive and indicatives for λύω:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;λῦσαι&lt;/li&gt;
&lt;li&gt;ἔλυσα&lt;/li&gt;
&lt;li&gt;ἔλυσας&lt;/li&gt;
&lt;li&gt;ἔλυσε(ν)&lt;/li&gt;
&lt;li&gt;ἐλύσαμεν&lt;/li&gt;
&lt;li&gt;ἐλύσατε&lt;/li&gt;
&lt;li&gt;ἔλυσαν&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Probably the most common term for this type is &lt;strong&gt;first aorist&lt;/strong&gt; but this implies some ordering (versus the &#34;second aorists&#34; in particular) that isn&#39;t particularly helpful in most cases.&lt;/p&gt;
&lt;p&gt;If we&#39;re contrasting the indicatives here with their imperfect equivalents, we&#39;d have&lt;/p&gt;
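&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;aorist&lt;/th&gt;
&lt;th&gt;imperfect&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xσα&lt;/td&gt;
&lt;td&gt;Xον&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xσας&lt;/td&gt;
&lt;td&gt;Xες&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xσε(ν)&lt;/td&gt;
&lt;td&gt;Xε(ν)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xσαμεν&lt;/td&gt;
&lt;td&gt;Xομεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xσατε&lt;/td&gt;
&lt;td&gt;Xετε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xσαν&lt;/td&gt;
&lt;td&gt;Xον&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;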
&lt;p&gt;The existence of the sigma is why these are often alternatively called &lt;strong&gt;sigmatic aorists&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;But if we just look at the distinguishers within the paradigm, we can drop the sigma and get the following (with the thematic and root aorist distinguisher patterns shown for comparison):&lt;/p&gt;
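&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;first aorist (σ dropped)&lt;/th&gt;
&lt;th&gt;thematic aorist&lt;/th&gt;
&lt;th&gt;root aorist&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xα&lt;/td&gt;
&lt;td&gt;Xον&lt;/td&gt;
&lt;td&gt;Xν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xας&lt;/td&gt;
&lt;td&gt;Xες&lt;/td&gt;
&lt;td&gt;Xς&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xε(ν)&lt;/td&gt;
&lt;td&gt;Xε(ν)&lt;/td&gt;
&lt;td&gt;X&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xαμεν&lt;/td&gt;
&lt;td&gt;Xομεν&lt;/td&gt;
&lt;td&gt;Xμεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xατε&lt;/td&gt;
&lt;td&gt;Xετε&lt;/td&gt;
&lt;td&gt;Xτε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xαν&lt;/td&gt;
&lt;td&gt;Xον&lt;/td&gt;
&lt;td&gt;Xσαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;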
&lt;p&gt;Observe that the &lt;strong&gt;2SG&lt;/strong&gt;, &lt;strong&gt;1PL&lt;/strong&gt;, &lt;strong&gt;2PL&lt;/strong&gt;, and &lt;strong&gt;3PL&lt;/strong&gt; all appear to have the same ending as the thematic aorists but with an alpha instead of the ε/ο theme vowel. This is why these aorists are often alternatively called &lt;strong&gt;alphathematic aorists&lt;/strong&gt;.&lt;/p&gt;
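&lt;p&gt;That observation can be checked mechanically. In this small sketch (the dictionary and function names are just for illustration) we swap the leading theme vowel of each thematic distinguisher for an alpha and see which cells then match the sigma-less distinguishers above.&lt;/p&gt;

```python
# Distinguishers as listed above, without the leading X.
ALPHA = {"1SG": "α", "2SG": "ας", "3SG": "ε(ν)",
         "1PL": "αμεν", "2PL": "ατε", "3PL": "αν"}
THEMATIC = {"1SG": "ον", "2SG": "ες", "3SG": "ε(ν)",
            "1PL": "ομεν", "2PL": "ετε", "3PL": "ον"}


def alpha_for_theme_vowel(distinguisher):
    """Replace the leading theme vowel (ε or ο) with α."""
    return "α" + distinguisher[1:]


matching = [cell for cell in ALPHA
            if alpha_for_theme_vowel(THEMATIC[cell]) == ALPHA[cell]]
print(matching)  # ['2SG', '1PL', '2PL', '3PL']
```

&lt;p&gt;The two cells that do not match, the &lt;strong&gt;1SG&lt;/strong&gt; and &lt;strong&gt;3SG&lt;/strong&gt;, are exactly the ones that need a separate explanation.&lt;/p&gt;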
&lt;p&gt;In the &lt;strong&gt;1SG&lt;/strong&gt;, the alpha ending is actually related to the ν in the thematic and root aorists. If a final ν (coming from a Proto-Indo-European *-m) follows a consonant, it becomes an α in Greek (coming from a syllabic *-m̥ in Proto-Indo-European). This is just a way of making an otherwise unpronounceable sequence pronounceable (in Greek). We see this same phenomenon in the accusative singular nouns (-ν when preceded by a vowel like in the 1st and 2nd declension, -α when preceded by a consonant in the 3rd declension).&lt;/p&gt;
&lt;p&gt;So ἔλυσα makes sense instead of ˣἔλυσν. But now we have an interesting question: is the -α the ending or part of the aorist stem? Its origins are clearly as the regular ending but in light of forms like ἔλυσας, ἐλύσαμεν, one might reanalyse it as part of what distinguishes this type of aorist.&lt;/p&gt;
&lt;p&gt;In the &lt;strong&gt;3SG&lt;/strong&gt; we find the bare ε (with movable nu in many cases) presumably by analogy with the thematic aorists. If an α had been used, it would be easily confused (without a nu) for the &lt;strong&gt;1SG&lt;/strong&gt; and (with the nu) for the &lt;strong&gt;3PL&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;3PL&lt;/strong&gt; is particularly interesting because it potentially explains the -σαν ending we see in the root aorists (and even athematic imperfects). Again this comes back to an interesting reanalysis.&lt;/p&gt;
&lt;p&gt;ἔλυσαν, if thought about in terms of the historical &lt;strong&gt;3PL&lt;/strong&gt; ending, could be segmented ἔλυσα-ν. If thought about in the context of the other personal endings in its paradigm, it could be segmented ἔλυσ-αν. But it could also be segmented ἔλυ-σαν, particularly in comparison to the imperfect. The morph -σαν could then have been internalised as indicating &lt;strong&gt;3PL&lt;/strong&gt; in the aorist or even aorist/imperfect.&lt;/p&gt;
&lt;p&gt;There&#39;s more we can say about this once we&#39;ve covered the perfect endings (probably a little while off!) as they&#39;re likely involved in this as well but this is yet another example of how morphology isn&#39;t really about concatenating morphemes with some compositional meaning resulting. It&#39;s a complex interaction of forms within a system.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part thirty-nine of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 38</title>
    <link href="https://jktauber.com/2020/01/03/a-tour-of-greek-morphology-part-38/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 38"/>
    <published>2020-01-03T16:00:00+08:00</published>
    <updated>2020-01-03T16:00:00+08:00</updated>
    <id>https://jktauber.com/2020/01/03/a-tour-of-greek-morphology-part-38</id>
    <content type="html" xml:base="https://jktauber.com/2020/01/03/a-tour-of-greek-morphology-part-38/">&lt;p&gt;Part thirty-eight of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;In the last post, we introduced the root aorist actives. We&#39;ll now introduce another type of aorist active.&lt;/p&gt;
&lt;p&gt;Here are the aorist active infinitive and indicative forms of λαμβάνω:&lt;/p&gt;
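&lt;table&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;λαβεῖν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔλαβον&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔλαβες&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔλαβε(ν)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐλάβομεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐλάβετε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔλαβον&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;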
&lt;p&gt;Notice that the infinitive -εῖν is like the present (but with a circumflex) and the indicative distinguishers follow &lt;strong&gt;IA-1&lt;/strong&gt; exactly.&lt;/p&gt;
&lt;p&gt;These distinguishers, just as with the thematic imperfects, consist of a theme vowel (an ablauting ε/ο) with the usual endings:&lt;/p&gt;
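&lt;table&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xον&lt;/td&gt;
&lt;td&gt;ο + ν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xες&lt;/td&gt;
&lt;td&gt;ε + ς&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xε(ν)&lt;/td&gt;
&lt;td&gt;ε + -&lt;/td&gt;
&lt;td&gt;movable nu&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xομεν&lt;/td&gt;
&lt;td&gt;ο + μεν&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xετε&lt;/td&gt;
&lt;td&gt;ε + τε&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xον&lt;/td&gt;
&lt;td&gt;ο + ν&lt;/td&gt;
&lt;td&gt;historically ντ but final τ dropping off&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;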
&lt;p&gt;These are often called &#34;second&#34; aorists (although we haven&#39;t looked at the so-called &#34;first&#34; aorists yet). I&#39;ll generally avoid that term and instead use the term &lt;strong&gt;thematic aorist&lt;/strong&gt; because of the theme vowel. Focusing on this distinctive makes it clearer what&#39;s going on with these types of aorist.&lt;/p&gt;
&lt;p&gt;However, the thematic aorist distinguisher patterns seem to pose an even bigger problem than the root aorist distinguisher patterns: how do these not get confused for imperfects (or presents in the case of the infinitive)?&lt;/p&gt;
&lt;p&gt;The answer is the same as for the root aorists: the &lt;em&gt;stem&lt;/em&gt; itself is also conveying grammatical information.&lt;/p&gt;
&lt;p&gt;The present/imperfect stem is λαμβαν+ε/ο but the aorist stem is λαβ+ε/ο. So λαβεῖν cannot be confused for the present infinitive because that would be λαμβάνειν. ἔλαβον cannot be confused for the imperfect &lt;strong&gt;1SG&lt;/strong&gt; or &lt;strong&gt;3PL&lt;/strong&gt; because they would be ἐλάμβανον. ἔλαβες cannot be confused for the imperfect &lt;strong&gt;2SG&lt;/strong&gt; because that would be ἐλάμβανες.&lt;/p&gt;
&lt;p&gt;This does mean, however, that you need to know the stems. If you don&#39;t know λαμβαν- / λαβ- at all, you won&#39;t know whether ἔλαβες is imperfect or aorist. Xες is ambiguous as to aspect unless you know whether X corresponds to an imperfective (present/imperfect) stem or a perfective (aorist) stem.&lt;/p&gt;
&lt;p&gt;Here are some other examples:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;εὑρίσκω has the imperfective stem εὑρισκ+ε/ο but the perfective stem εὑρ+ε/ο&lt;/li&gt;
&lt;li&gt;ὁράω has the imperfective stem ὁρα+ε/ο but the perfective stem ἰδ+ε/ο (we&#39;ll discuss later why this augments as εἰδ-)&lt;/li&gt;
&lt;li&gt;ἔρχομαι has the imperfective stem ἐρχ+ε/ο but the perfective stem ἐλθ+ε/ο&lt;/li&gt;
&lt;li&gt;λέγω has the imperfective stem λεγ+ε/ο but the perfective stem εἰπ+ε/ο&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We&#39;ll talk a lot more about the relationship between these stems in future posts so don&#39;t worry about those details just yet. The main thing I want to start to get across here is that the endings don&#39;t discriminate imperfective and perfective. The stem itself indicates both the lexeme AND the aspect. For this reason, they are sometimes called aspect stems and, as we have already done above, we can refer to the perfective stem or the imperfective stem. Lots more on that soon!&lt;/p&gt;
&lt;p&gt;This has implications for morphological theory and morpheme-based approaches. There&#39;s no &#34;morpheme&#34; in ἔλαβες expressing just the perfective aspect.&lt;/p&gt;
&lt;p&gt;We&#39;ll end this post summarising the differences between the root aorists and thematic aorists (which otherwise share the same endings):&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;root aorists&lt;/th&gt;
&lt;th&gt;thematic aorists&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;no thematic vowel&lt;/td&gt;
&lt;td&gt;thematic vowel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;infinitive -ναι&lt;/td&gt;
&lt;td&gt;infinitive -εῖν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3rd plural ending -σαν&lt;/td&gt;
&lt;td&gt;3rd plural ending -ν &amp;lt; -ντ&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part thirty-eight of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 37</title>
    <link href="https://jktauber.com/2020/01/02/a-tour-of-greek-morphology-part-37/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 37"/>
    <published>2020-01-02T16:00:00+08:00</published>
    <updated>2020-01-02T16:00:00+08:00</updated>
    <id>https://jktauber.com/2020/01/02/a-tour-of-greek-morphology-part-37</id>
    <content type="html" xml:base="https://jktauber.com/2020/01/02/a-tour-of-greek-morphology-part-37/">&lt;p&gt;Part thirty-seven of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;For our exploration of the aorist forms, we&#39;re going to start with the aorist active infinitive and indicatives of βαίνω:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;βῆναι&lt;/li&gt;
&lt;li&gt;ἔβην&lt;/li&gt;
&lt;li&gt;ἔβης&lt;/li&gt;
&lt;li&gt;ἔβη&lt;/li&gt;
&lt;li&gt;ἔβημεν&lt;/li&gt;
&lt;li&gt;ἔβητε&lt;/li&gt;
&lt;li&gt;ἔβησαν&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This may seem a little unusual (don&#39;t worry, we&#39;ll get to the aorist forms of λύω soon enough) but it will turn out to lay a better foundation, I think.&lt;/p&gt;
&lt;p&gt;Here are a few more paradigms of the same type:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;γι(γ)νώσκω&lt;/th&gt;
&lt;th&gt;βαίνω&lt;/th&gt;
&lt;th&gt;ἵστημι&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;γνῶναι*&lt;/td&gt;
&lt;td&gt;βῆναι (διαβῆναι*)&lt;/td&gt;
&lt;td&gt;στῆναι*&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔγνων*&lt;/td&gt;
&lt;td&gt;ἔβην (ἀνέβην*)&lt;/td&gt;
&lt;td&gt;ἔστην (ἀντέστην*)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔγνως*&lt;/td&gt;
&lt;td&gt;ἔβης&lt;/td&gt;
&lt;td&gt;ἔστης&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔγνω*&lt;/td&gt;
&lt;td&gt;ἔβη (ἀνέβη*)&lt;/td&gt;
&lt;td&gt;ἔστη*&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔγνωμεν (ἐπέγνωμεν*)&lt;/td&gt;
&lt;td&gt;ἔβημεν (ἐνέβημεν*)&lt;/td&gt;
&lt;td&gt;ἔστημεν (ἐξέστημεν*)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔγνωτε (ἐπέγνωτε*)&lt;/td&gt;
&lt;td&gt;ἔβητε&lt;/td&gt;
&lt;td&gt;ἔστητε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἔγνωσαν*&lt;/td&gt;
&lt;td&gt;ἔβησαν (ἀνέβησαν*)&lt;/td&gt;
&lt;td&gt;ἔστησαν*&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;* indicates that the form appears in SBLGNT. Where the base form does not appear but a compound with a preverb does, I&#39;ve included that in parentheses.&lt;/p&gt;
&lt;p&gt;Note the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the &lt;strong&gt;INF&lt;/strong&gt; does not have an augment but the indicatives do&lt;/li&gt;
&lt;li&gt;the &lt;strong&gt;INF&lt;/strong&gt; is always a properispomenon. In other words, it has a circumflex on the penultimate syllable. This could be explained in the above cases by the ending being -εναι with contraction taking place (although we&#39;d want other evidence to be sure)&lt;/li&gt;
&lt;li&gt;the consistent, lexeme-specific part of the form within a paradigm is a consonant or consonant cluster followed by a long vowel: γνω, βη, στη&lt;/li&gt;
&lt;li&gt;the present/imperfect stem and aorist stem are not the same and, in fact, the relationship between the present/imperfect stem and aorist stem appears to be different for each lexeme so far!&lt;/li&gt;
&lt;li&gt;the regular recessive accent means the indicative forms always end up having an acute on the augment&lt;/li&gt;
&lt;li&gt;there is no thematic vowel (i.e. no ablauting ε/o at the end of the stem)&lt;/li&gt;
&lt;li&gt;there is no vowel length alternation between the singular and plural&lt;/li&gt;
&lt;li&gt;the &lt;strong&gt;3PL&lt;/strong&gt; ending is -σαν like the athematic imperfects&lt;/li&gt;
&lt;li&gt;the rest of the endings are like all the &lt;strong&gt;IA&lt;/strong&gt; from &lt;strong&gt;IA-1&lt;/strong&gt; through &lt;strong&gt;IA-9&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;the fact the endings are like the &lt;strong&gt;IA&lt;/strong&gt; would lead to lack of distinction between the imperfect and aorist if not for the stem differences!&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To summarise, our distinguishers (augment aside) are:&lt;/p&gt;
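&lt;table&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-ναι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-ν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-ς&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-μεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-τε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-σαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;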
&lt;p&gt;Because the endings go directly on the verbal root with no thematic vowel and with no other morphological changes, these aorists are often called &lt;strong&gt;root aorists&lt;/strong&gt;. They&#39;re not normally introduced first (they aren&#39;t common by number of distinct lexemes, although they are reasonably common by token count) but I&#39;ve chosen to start with them in this tour because they lay a good foundation for comparing and contrasting other types of aorist.&lt;/p&gt;
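&lt;p&gt;Because nothing intervenes between root and ending, the uncompounded indicatives can be generated by plain concatenation. The sketch below is only an illustration (the names are my own). It relies on the observation above that the recessive accent always lands on the augment in these forms, and it deliberately leaves out the infinitive and compounds with preverbs, where the accent falls elsewhere.&lt;/p&gt;

```python
import unicodedata

AUGMENT = "\u1f14"  # epsilon with smooth breathing and acute

# Root aorist indicative distinguishers (augment aside).
ROOT_AORIST_ENDINGS = {
    "1SG": "ν",
    "2SG": "ς",
    "3SG": "",
    "1PL": "μεν",
    "2PL": "τε",
    "3PL": "σαν",
}


def root_aorist_indicative(stem):
    """Build the six indicative forms from an unaccented root such as βη."""
    return {
        cell: unicodedata.normalize("NFC", AUGMENT + stem + ending)
        for cell, ending in ROOT_AORIST_ENDINGS.items()
    }


for stem in ["γνω", "βη", "στη"]:
    print(" ".join(root_aorist_indicative(stem).values()))
```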
&lt;p&gt;In the next post of this tour, we&#39;ll introduce another of these types: aorists that &lt;em&gt;do&lt;/em&gt; have a thematic vowel.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part thirty-seven of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 36</title>
    <link href="https://jktauber.com/2019/12/31/a-tour-of-greek-morphology-part-36/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 36"/>
    <published>2019-12-31T16:00:00+08:00</published>
    <updated>2019-12-31T16:00:00+08:00</updated>
    <id>https://jktauber.com/2019/12/31/a-tour-of-greek-morphology-part-36</id>
    <content type="html" xml:base="https://jktauber.com/2019/12/31/a-tour-of-greek-morphology-part-36/">&lt;p&gt;Part thirty-six of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;We&#39;ve now spent a lot of time looking at distinguishers and inflectional classes within each of the indicative personal ending paradigms (present active, present middle, imperfect active, and imperfect middle) of each lexeme. We also checked the consistency of the lexeme-specific part (the X or &#34;theme&#34;) in each paradigm.&lt;/p&gt;
&lt;p&gt;But we haven&#39;t really talked about the consistency of the lexeme-specific part across the PAs, PMs, IAs, and IMs. Perhaps not surprisingly, the same theme is used by a lexeme for both the PA and the PM (if both voices are used) and likewise the IA and IM. In other words, voice is not indicated by the theme in the present and imperfect, only by the set of endings used.&lt;/p&gt;
&lt;p&gt;But what about the theme consistency between the present and the imperfect? That&#39;s what we&#39;ll look at now.&lt;/p&gt;
&lt;p&gt;Given that the present and imperfect differ in their sets of endings (other than in the &lt;strong&gt;1PL&lt;/strong&gt; and &lt;strong&gt;2PL&lt;/strong&gt;) there is not a &lt;em&gt;huge&lt;/em&gt; need to use any additional mechanism to express present versus imperfect.&lt;/p&gt;
&lt;p&gt;But as mentioned when we first started with the imperfects, there &lt;em&gt;is&lt;/em&gt; a difference besides the endings, namely the &lt;strong&gt;augment&lt;/strong&gt; in the imperfect: typically either a prefixed ε before an initial consonant or a lengthened initial vowel.&lt;/p&gt;
&lt;p&gt;The situation is made slightly more complex by the fact that this augmentation applies before the incorporation of any prepositional &#34;preverb&#34;. Greek had a quite productive way of forming new verbs by prefixing base verbs with certain prepositions (a topic worthy of some posts another time).&lt;/p&gt;
&lt;p&gt;But for our discussion of augments and the relationship between the present and imperfect forms, we will firstly look at the cases where there is no preverb.&lt;/p&gt;
&lt;p&gt;There are 108 lemmas in the SBLGNT without preverbs that start with a consonant in the present and so in the imperfect are just prefixed with ἐ- (e.g. βλεπ- ~ ἐβλεπ-; διδ- ~ ἐδιδ-).&lt;/p&gt;
&lt;p&gt;The situation is a little different when the present starts with a vowel. In such cases, the vowel essentially lengthens.&lt;/p&gt;
&lt;p&gt;In three cases (ἐάω, ἕλκω, ἔχω) ε becomes ει (e.g. ἐχ- ~ εἰχ-)&lt;/p&gt;
&lt;p&gt;In six cases (ἐγγίζω, ἐλπίζω, ἐργάζομαι, ἔρχομαι, ἐρωτάω, ἐσθίω) the ε becomes η (e.g. ἐρχ- ~ ἠρχ-).&lt;/p&gt;
&lt;p&gt;We&#39;ll explore the difference in a later post.&lt;/p&gt;
&lt;p&gt;Initial ευ stays as ευ in two cases (εὐδοκέω, εὐπορέομαι) but becomes ηυ in one (εὔχομαι ~ ηὐχόμην). Sometimes this happens within a single lexeme too (see the end of this post).&lt;/p&gt;
&lt;p&gt;In the five cases of an initial ο- (ὁμιλέω, ὁμολογέω, ὀνειδίζω, ὀρθρίζω, ὀφείλω), it becomes ω. Note that it does &lt;em&gt;not&lt;/em&gt; become ου.&lt;/p&gt;
&lt;p&gt;οι in οἰκοδομέω becomes ῳ (οἰκοδομ- ~ ᾠκοδομ-).&lt;/p&gt;
&lt;p&gt;The one case of an initial η (ἥκω) stays as η.&lt;/p&gt;
&lt;p&gt;The vowels α, ι, and υ which can be short or long just become their long variety but of course long α generally becomes η in Attic and Koine without a preceding ι, ε, or ρ.&lt;/p&gt;
&lt;p&gt;There are 19 cases of α becoming η (e.g. &lt;strong&gt;2PL&lt;/strong&gt; ἀγαπᾶτε ~ ἠγαπᾶτε).&lt;/p&gt;
&lt;p&gt;There are two cases of ι becoming a long ι (ἰάομαι, ἰσχύω) and one of υ becoming a long υ (ὑμνέω).&lt;/p&gt;
&lt;p&gt;In one case (αἰτέω) αι becomes ῃ and in two cases (αὐλίζομαι, αὐξάνω) αυ becomes ηυ.&lt;/p&gt;
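&lt;p&gt;The predictable part of the rules so far fits in a small lookup table. The following is a sketch under some assumptions of my own: stems are given unaccented (as in the examples above) and without a preverb, the names are made up for illustration, and anything starting with ε or ευ is refused because the outcome there is lexically determined. Initial ρ and any vowels not discussed above are refused too.&lt;/p&gt;

```python
import unicodedata

# (present-stem prefix, imperfect-stem prefix); diphthongs come first so
# they are tried before single vowels.
VOWEL_AUGMENTS = [
    ("\u03b1\u1f30", "\u1f90"),        # αἰ ~ ᾐ
    ("\u03b1\u1f50", "\u03b7\u1f50"),  # αὐ ~ ηὐ
    ("\u03bf\u1f30", "\u1fa0"),        # οἰ ~ ᾠ
    ("\u1f00", "\u1f20"),              # ἀ ~ ἠ
    ("\u1f40", "\u1f60"),              # ὀ ~ ὠ
    ("\u1f41", "\u1f61"),              # ὁ ~ ὡ
]

# Initial η, ι and υ (either breathing) are spelled the same when augmented.
UNCHANGED = ("\u1f20", "\u1f21", "\u1f30", "\u1f31", "\u1f50", "\u1f51")

# β γ δ ζ θ κ λ μ ν ξ π σ τ φ χ ψ
CONSONANTS = set(
    "\u03b2\u03b3\u03b4\u03b6\u03b8\u03ba\u03bb\u03bc"
    "\u03bd\u03be\u03c0\u03c3\u03c4\u03c6\u03c7\u03c8"
)


def augment(stem):
    """Map an unaccented present stem to its imperfect stem."""
    stem = unicodedata.normalize("NFC", stem)
    for present, imperfect in VOWEL_AUGMENTS:
        if stem.startswith(present):
            return imperfect + stem[len(present):]
    if stem.startswith(UNCHANGED):
        return stem
    if stem[:1] in CONSONANTS:
        return "\u1f10" + stem  # prefix ἐ
    raise ValueError("augment not predictable from spelling alone: " + stem)


for stem in ["βλεπ", "ἀγαπ", "ὀφειλ", "οἰκοδομ", "αἰτ", "αὐξαν", "ἰσχυ"]:
    print(stem, "~", augment(stem))
```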
&lt;p&gt;We now turn to those verbs with a preverb (or which augment as if they do).&lt;/p&gt;
&lt;p&gt;Because a lot of prepositions end in vowels or other sounds that interact with the start of the verb root or with the augment there is often elision or assimilation.&lt;/p&gt;
&lt;p&gt;For example ἀναβαίν- (ανα + βαιν-) in the present becomes ἀνεβαιν- (αν’ + ε + βαιν-) in the imperfect. The ἀνα- is intact in the present but elided in the imperfect. ἀνεχ- in the present becomes ἀνειχ- in the imperfect: ἀνα- elided to ἀν’- in both present and imperfect.&lt;/p&gt;
&lt;p&gt;The ε augment often has the effect of breaking the consecutive consonants that assimilate in the present. For example ἐμβλεπ- in the present becomes ἐνεβλεπ- in the imperfect, the ν in ἐν- no longer becoming μ in the presence of the following labial β.&lt;/p&gt;
&lt;p&gt;συν- is particularly worth observing because you get things like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;συγχαιρ- ~ συνεχαιρ-&lt;/li&gt;
&lt;li&gt;συζητ- ~ συνεζητ-&lt;/li&gt;
&lt;li&gt;συλλαλ- ~ συνελαλ-&lt;/li&gt;
&lt;li&gt;συμβαιν- ~ συνεβαιν-&lt;/li&gt;
&lt;/ul&gt;
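&lt;p&gt;The four pairs above can be reproduced with a handful of assimilation rules. This is only a sketch with names I made up for the purpose. It covers consonant-initial stems and the sound classes that appear in the list (velars, labials, λ and ζ), and it is not a complete account of what happens to ν before every consonant.&lt;/p&gt;

```python
VELARS = set("\u03ba\u03b3\u03c7\u03be")         # κ γ χ ξ
LABIALS = set("\u03c0\u03b2\u03c6\u03c8\u03bc")  # π β φ ψ μ
SY = "\u03c3\u03c5"                              # συ


def syn_compound(stem, imperfect=False):
    """Attach συν- to a consonant-initial stem.

    In the imperfect the ε augment separates ν from the stem, so there is
    nothing for the ν to assimilate to.
    """
    if imperfect:
        return SY + "\u03bd\u03b5" + stem  # συν + ε + stem
    first = stem[:1]
    if first in VELARS:
        return SY + "\u03b3" + stem        # ν becomes γ before a velar
    if first in LABIALS:
        return SY + "\u03bc" + stem        # ν becomes μ before a labial
    if first == "\u03bb":
        return SY + "\u03bb" + stem        # ν becomes λ before λ
    if first == "\u03b6":
        return SY + stem                   # ν drops before ζ
    return SY + "\u03bd" + stem


for stem in ["χαιρ", "ζητ", "λαλ", "βαιν"]:
    print(syn_compound(stem), "~", syn_compound(stem, imperfect=True))
```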
&lt;p&gt;At some point it might be fun to whip up the exact finite-state model for all this but for now, I&#39;ll just note the counts.&lt;/p&gt;
&lt;p&gt;There are 100 examples with preverbs plus a consonant-initial verb stem.&lt;/p&gt;
&lt;p&gt;Preverbs plus a vowel-initial verb stem follow the same sound rules as without a preverb and the expected elision can be found.&lt;/p&gt;
&lt;p&gt;There are 16 cases of α&amp;gt;η with a preverb (e.g. περιαγ- ~ περιηγ-; ἀπαγγελλ- ~ ἀπηγγελλ-). There are two cases of αι&amp;gt;ῃ with a preverb, 8 cases of ε&amp;gt;ει, 11 cases of ε&amp;gt;η, one case of ευ&amp;gt;ευ, 4 cases of η&amp;gt;η, 6 cases of ι&amp;gt;ι, and 3 cases of ο&amp;gt;ω.&lt;/p&gt;
&lt;p&gt;Here&#39;s a summary of all these counts:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;no preverb&lt;/th&gt;
&lt;th&gt;with preverb&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;C &amp;gt; εC&lt;/td&gt;
&lt;td&gt;108&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ε &amp;gt; ει&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ε &amp;gt; η&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ευ &amp;gt; ευ&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ευ &amp;gt; ηυ&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ο &amp;gt; ω&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;οι &amp;gt; ῳ&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;η &amp;gt; η&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;α &amp;gt; η&lt;/td&gt;
&lt;td&gt;19&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ι &amp;gt; ι&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;υ &amp;gt; υ&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;αι &amp;gt; ῃ&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;αυ &amp;gt; ηυ&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;There is the interesting case of εὐαγγελίζω which is treated in Acts as if εὐ were a preverb and the imperfect form εὐηγγελίζοντο is found (with α&amp;gt;η). This is not counted in the 16 above.&lt;/p&gt;
&lt;p&gt;We also in Acts find προορώμην which does not have an augment but which clearly has the imperfect middle ending.&lt;/p&gt;
&lt;p&gt;Not included in the counts above are εἰμί and compounds which are probably worth their own post at some point (although note that for the most part the imperfect stem is just η). Also not included are εἶμι and compounds where the imperfect stem is ῃ.&lt;/p&gt;
&lt;p&gt;We have a handful of cases where the lemma having multiple inflectional classes prevents a trivial mapping between the present and imperfect stems in all instances (ἀφίημι, δέω, συγχέω, ἵστημι, and πλέω, each of which we&#39;ve discussed before.) Once the right stem is chosen to map to, the rules above apply cleanly.&lt;/p&gt;
&lt;p&gt;That leaves us with five imperfect stems whose relationship to the present stem has not yet been covered. They are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;δύναμαι having imperfects in both ἐδυν- and ἠδυν-&lt;/li&gt;
&lt;li&gt;μέλλω having imperfects in both ἐμελλ- and ἠμελλ-&lt;/li&gt;
&lt;li&gt;θέλω having an imperfect in ἠθελ- (because the present was originally ἐθελ-)&lt;/li&gt;
&lt;li&gt;εὐκαιρέω having an imperfect in εὐκαιρ- and ηὐκαιρ-&lt;/li&gt;
&lt;li&gt;εὑρίσκω having an imperfect in εὑρισκ- and ηὑρισκ-&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Other than these and the ε&amp;gt;ει versus ε&amp;gt;η distinction, augmentation of the present stem to form the imperfect is entirely consistent and predictable.&lt;/p&gt;
&lt;p&gt;We&#39;ll dive into more detail on the augment later on but we&#39;ve now reached a good time to leave behind the present/imperfect and start to look at the aorist in the new year!&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part thirty-six of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 35</title>
    <link href="https://jktauber.com/2019/12/30/a-tour-of-greek-morphology-part-35/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 35"/>
    <published>2019-12-30T16:00:00+08:00</published>
    <updated>2019-12-30T16:00:00+08:00</updated>
    <id>https://jktauber.com/2019/12/30/a-tour-of-greek-morphology-part-35</id>
    <content type="html" xml:base="https://jktauber.com/2019/12/30/a-tour-of-greek-morphology-part-35/">&lt;p&gt;Part thirty-five of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;To finish up our coverage of the imperfect indicative endings, we&#39;ll now use the disambiguation we did in the previous post to produce SBLGNT counts for the imperfect middles.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;class&lt;/th&gt;
&lt;th&gt;# lemmas&lt;/th&gt;
&lt;th&gt;# tokens&lt;/th&gt;
&lt;th&gt;# hapakes&lt;/th&gt;
&lt;th&gt;lemmas (* = hapax)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM-1&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;54&lt;/td&gt;
&lt;td&gt;124&lt;/td&gt;
&lt;td&gt;34&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM-2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;27&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM-3&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM-4&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;ἐμβριμάομαι* ἰάομαι προοράω* ἐπακροάομαι* πειράω* μασάομαι*&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM-5&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;χράομαι*&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM-6&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM-7&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;ἐκτίθημι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM-8&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM-9&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;28&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;ἐξίστημι δύναμαι ἀφίσταμαι* ἀνθίστημι*&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM-10&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;εἰμί&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM-11&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;συνανάκειμαι ἀνάκειμαι* κεῖμαι κατάκειμαι ἐπίκειμαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM-12&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;κάθημαι&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;And the counts for each paradigm cell for each class:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IM-1&lt;/th&gt;
&lt;th&gt;IM-2&lt;/th&gt;
&lt;th&gt;IM-3&lt;/th&gt;
&lt;th&gt;IM-4&lt;/th&gt;
&lt;th&gt;IM-5&lt;/th&gt;
&lt;th&gt;IM-6&lt;/th&gt;
&lt;th&gt;IM-7&lt;/th&gt;
&lt;th&gt;IM-8&lt;/th&gt;
&lt;th&gt;IM-9&lt;/th&gt;
&lt;th&gt;IM-10&lt;/th&gt;
&lt;th&gt;IM-11&lt;/th&gt;
&lt;th&gt;IM-12&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;66&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;18&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;46&lt;/td&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TOTAL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;124&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;27&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;7&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;28&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;20&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;13&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;11&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
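&lt;p&gt;As a rough sketch of how counts like those in the two tables above could be tallied, here is a small Python example. It assumes the disambiguated data is available as (lemma, person-number, class) triples, one per token; that layout and the function names are my own illustration, not the format of the actual data. A hapax is taken here to be a lemma with exactly one token among the forms being counted. The sample uses only the three smallest classes from the tables.&lt;/p&gt;

```python
from collections import Counter, defaultdict

# Assumed layout: one (lemma, person-number, class) triple per token.
# The sample covers only IM-5, IM-7 and IM-12 as reported in the tables.
tokens = (
    [("χράομαι", "3PL", "IM-5")]
    + [("ἐκτίθημι", "3SG", "IM-7")] * 2
    + [("κάθημαι", "3SG", "IM-12")] * 11
)


def class_summary(tokens):
    """Return {class: (lemma count, token count, hapax count)}."""
    per_class = defaultdict(Counter)
    for lemma, _cell, ic in tokens:
        per_class[ic][lemma] += 1
    summary = {}
    for ic, lemma_counts in per_class.items():
        hapakes = [lemma for lemma, n in lemma_counts.items() if n == 1]
        summary[ic] = (
            len(lemma_counts),
            sum(lemma_counts.values()),
            len(hapakes),
        )
    return summary


def cell_counts(tokens):
    """Return a Counter keyed by (class, person-number)."""
    return Counter((ic, cell) for _lemma, cell, ic in tokens)


for ic, row in class_summary(tokens).items():
    print(ic, *row)
for (ic, cell), n in cell_counts(tokens).items():
    print(ic, cell, n)
```

&lt;p&gt;For these three classes the sketch gives 1/1/1, 1/2/0 and 1/11/0 for lemmas/tokens/hapakes, matching the corresponding rows of the first table.&lt;/p&gt;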
&lt;p&gt;And just like we did for the actives, let&#39;s summarise which forms and distinguisher patterns are most common:&lt;/p&gt;
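&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IM-1&lt;/th&gt;
&lt;th&gt;IM-2&lt;/th&gt;
&lt;th&gt;IM-4&lt;/th&gt;
&lt;th&gt;IM-5&lt;/th&gt;
&lt;th&gt;IM-7&lt;/th&gt;
&lt;th&gt;IM-9&lt;/th&gt;
&lt;th&gt;IM-10&lt;/th&gt;
&lt;th&gt;IM-11&lt;/th&gt;
&lt;th&gt;IM-12&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐβουλόμην&amp;nbsp;3/8 + other&amp;nbsp;-όμην&lt;/td&gt;
&lt;td&gt;ἐφοβούμην&amp;nbsp;1/1&lt;/td&gt;
&lt;td&gt;προορώμην&amp;nbsp;1/1&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;ἤμην&lt;/strong&gt;&amp;nbsp;15/15&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἤρχου&amp;nbsp;1/1&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-ετο&lt;/td&gt;
&lt;td&gt;-εῖτο&lt;/td&gt;
&lt;td&gt;ἰᾶτο&amp;nbsp;2/2&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἐξετίθετο&amp;nbsp;2/2&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;ἐδύνατο&lt;/strong&gt;&amp;nbsp;11/18 + other&amp;nbsp;-ατο&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἔκειτο&amp;nbsp;4/10 κατέκειτο&amp;nbsp;4/10 + other&amp;nbsp;-ειτο&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;ἐκάθητο&lt;/strong&gt;&amp;nbsp;11/11&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἐπορευόμεθα&amp;nbsp;1/1&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἤμεθα&amp;nbsp;5/5&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;διελογίζεσθε&amp;nbsp;1/2 ἀνείχεσθε&amp;nbsp;1/2&lt;/td&gt;
&lt;td&gt;ἠκαιρεῖσθε&amp;nbsp;1/1&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἐδύνασθε&amp;nbsp;1/1&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-οντο&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;ἐφοβοῦντο&lt;/strong&gt;&amp;nbsp;10/17 + other&amp;nbsp;-οῦντο&lt;/td&gt;
&lt;td&gt;ἐνεβριμῶντο&amp;nbsp;1/4 ἐπηκροῶντο&amp;nbsp;1/4 ἐπειρῶντο&amp;nbsp;1/4 ἐμασῶντο&amp;nbsp;1/4&lt;/td&gt;
&lt;td&gt;ἐχρῶντο&amp;nbsp;1/1&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;ἐξίσταντο&amp;nbsp;6/9 ἠδύναντο&amp;nbsp;3/9&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;συνανέκειντο&amp;nbsp;2/3 ἐπέκειντο&amp;nbsp;1/3&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;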
&lt;p&gt;That brings our discussion of the imperfect middles up to where we got to three posts ago with the imperfect actives and where we got with the present indicatives many posts ago.&lt;/p&gt;
&lt;p&gt;There&#39;s one more post I&#39;ll do to close out the year (before moving on to the aorist system in 2020). I want to look at the relationship between the &#34;X&#34; in the present and the &#34;X&#34; in the imperfect within each lexeme for which we have forms in each. In other words, we&#39;ll take a closer (although not yet comprehensive) look at the augment.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part thirty-five of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 34</title>
    <link href="https://jktauber.com/2019/12/29/a-tour-of-greek-morphology-part-34/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 34"/>
    <published>2019-12-29T16:00:00+08:00</published>
    <updated>2019-12-29T16:00:00+08:00</updated>
    <id>https://jktauber.com/2019/12/29/a-tour-of-greek-morphology-part-34</id>
    <content type="html" xml:base="https://jktauber.com/2019/12/29/a-tour-of-greek-morphology-part-34/">&lt;p&gt;Part thirty-four of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;It&#39;s now time to sort out any inflectional class (IC) ambiguities in our imperfect middle endings. As usual, I&#39;ve written code evaluating the forms in the SBLGNT and assigning each one a single IC. The rules used are as follows:&lt;/p&gt;
&lt;table&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;3SG&lt;/b&gt;:-ετο or
      &lt;b&gt;2PL&lt;/b&gt;:-εσθε
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;IM-1&lt;/b&gt; if lemma ends in -ω or -ομαι&lt;br&gt;
      &lt;b&gt;IM-7&lt;/b&gt; if lemma ends in -ημι
    &lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;1SG&lt;/b&gt;:-όμην or
      &lt;b&gt;1PL&lt;/b&gt;:-όμεθα or
      &lt;b&gt;3PL&lt;/b&gt;:-οντο
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;IM-1&lt;/b&gt; if lemma ends in -ω or -ομαι&lt;br&gt;
      &lt;b&gt;IM-8&lt;/b&gt; if lemma ends in -ωμι
    &lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;1SG&lt;/b&gt;:-ούμην or
      &lt;b&gt;2SG&lt;/b&gt;:-οῦ or
      &lt;b&gt;3PL&lt;/b&gt;:-οῦντο
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;IM-2&lt;/b&gt; if lemma ends in -έω or -έομαι&lt;br&gt;
      &lt;b&gt;IM-3&lt;/b&gt; if lemma ends in -όω or -όομαι
    &lt;/td&gt;
  &lt;/tr&gt;

  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;1SG&lt;/b&gt;:-ώμην or
      &lt;b&gt;2SG&lt;/b&gt;:-ῶ or
      &lt;b&gt;1PL&lt;/b&gt;:-ώμεθα or
      &lt;b&gt;3PL&lt;/b&gt;:-ῶντο
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;IM-5&lt;/b&gt; if lemma is χράομαι&lt;br&gt;
      &lt;b&gt;IM-4&lt;/b&gt; if lemma ends in -άω or -άομαι
    &lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;
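&lt;p&gt;The rules in the table above could be encoded along the following lines. This is only a minimal sketch and not the code actually used for the disambiguation: the &lt;code&gt;RULES&lt;/code&gt; structure, the &lt;code&gt;disambiguate&lt;/code&gt; function and its arguments are hypothetical names chosen for illustration. Lemma tests are tried in order and the first match wins, which is why χράομαι is checked before the general -άω/-άομαι test.&lt;/p&gt;

```python
import unicodedata


def nfc(text):
    # Normalise so visually identical accented letters compare equal.
    return unicodedata.normalize("NFC", text)


# Each rule: (person-number:ending keys, ordered lemma tests).
# A lemma test is (kind, values, class); the first matching test wins.
RULES = [
    ({"3SG:ετο", "2PL:εσθε"},
     [("ends", ("ω", "ομαι"), "IM-1"), ("ends", ("ημι",), "IM-7")]),
    ({"1SG:όμην", "1PL:όμεθα", "3PL:οντο"},
     [("ends", ("ω", "ομαι"), "IM-1"), ("ends", ("ωμι",), "IM-8")]),
    ({"1SG:ούμην", "2SG:οῦ", "3PL:οῦντο"},
     [("ends", ("έω", "έομαι"), "IM-2"), ("ends", ("όω", "όομαι"), "IM-3")]),
    ({"1SG:ώμην", "2SG:ῶ", "1PL:ώμεθα", "3PL:ῶντο"},
     [("is", ("χράομαι",), "IM-5"), ("ends", ("άω", "άομαι"), "IM-4")]),
]


def disambiguate(cell, ending, lemma):
    """Return the class for an ambiguous ending, or None if no rule applies."""
    key = nfc(cell + ":" + ending)
    lemma = nfc(lemma)
    for keys, tests in RULES:
        if key not in {nfc(k) for k in keys}:
            continue
        for kind, values, ic in tests:
            values = tuple(nfc(v) for v in values)
            if kind == "is" and lemma in values:
                return ic
            if kind == "ends" and lemma.endswith(values):
                return ic
    return None


print(disambiguate("3SG", "ετο", "ἐκτίθημι"))    # IM-7
print(disambiguate("3PL", "ῶντο", "χράομαι"))     # IM-5
print(disambiguate("3PL", "οῦντο", "φοβέομαι"))   # IM-2
```

&lt;p&gt;Endings that are already unambiguous (such as &lt;b&gt;3SG&lt;/b&gt;:-ατο) are not covered by any rule, so the sketch returns &lt;code&gt;None&lt;/code&gt; for them and the class would come straight from the paradigm tables.&lt;/p&gt;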

&lt;p&gt;You can download the results of the disambiguation &lt;a href=&#34;https://gist.github.com/jtauber/4c9f0c975a28e722b54cfe0cec14e4b4&#34;&gt;here&lt;/a&gt;. We&#39;ll use this to do our counts in the next post.&lt;/p&gt;
&lt;p&gt;Let&#39;s first ask our usual questions, though...&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Are the disambiguated inflectional classes consistent for each lexeme?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Yes, in this case they are, so no further comment is needed.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Is the value of X in our paradigm patterns consistent across a lexeme?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;There&#39;s only one exception, to do with the augment:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;δύναμαι: ἐδυν or ἠδυν&lt;/li&gt;
&lt;/ul&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part thirty-four of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 33</title>
    <link href="https://jktauber.com/2019/12/28/a-tour-of-greek-morphology-part-33/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 33"/>
    <published>2019-12-28T16:00:00+08:00</published>
    <updated>2019-12-28T16:00:00+08:00</updated>
    <id>https://jktauber.com/2019/12/28/a-tour-of-greek-morphology-part-33</id>
    <content type="html" xml:base="https://jktauber.com/2019/12/28/a-tour-of-greek-morphology-part-33/">&lt;p&gt;Part thirty-three of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;In &lt;a href=&#34;{% post_url 2019-11-29-a-tour-of-greek-morphology-part-29 %}&#34;&gt;part 29&lt;/a&gt;, we summarised our imperfect middle indicative paradigms as follows:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IM-1&lt;/th&gt;
&lt;th&gt;IM-2&lt;/th&gt;
&lt;th&gt;IM-3&lt;/th&gt;
&lt;th&gt;IM-4&lt;/th&gt;
&lt;th&gt;IM-5&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xόμην&lt;/td&gt;
&lt;td&gt;Xούμην&lt;/td&gt;
&lt;td&gt;Xούμην&lt;/td&gt;
&lt;td&gt;Xώμην&lt;/td&gt;
&lt;td&gt;Xώμην&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xου&lt;/td&gt;
&lt;td&gt;Xοῦ&lt;/td&gt;
&lt;td&gt;Xοῦ&lt;/td&gt;
&lt;td&gt;Xῶ&lt;/td&gt;
&lt;td&gt;Xῶ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xετο&lt;/td&gt;
&lt;td&gt;Xεῖτο&lt;/td&gt;
&lt;td&gt;Xοῦτο&lt;/td&gt;
&lt;td&gt;Xᾶτο&lt;/td&gt;
&lt;td&gt;Xῆτο&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xόμεθα&lt;/td&gt;
&lt;td&gt;Xούμεθα&lt;/td&gt;
&lt;td&gt;Xούμεθα&lt;/td&gt;
&lt;td&gt;Xώμεθα&lt;/td&gt;
&lt;td&gt;Xώμεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xεσθε&lt;/td&gt;
&lt;td&gt;Xεῖσθε&lt;/td&gt;
&lt;td&gt;Xοῦσθε&lt;/td&gt;
&lt;td&gt;Xᾶσθε&lt;/td&gt;
&lt;td&gt;Xῆσθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xοντο&lt;/td&gt;
&lt;td&gt;Xοῦντο&lt;/td&gt;
&lt;td&gt;Xοῦντο&lt;/td&gt;
&lt;td&gt;Xῶντο&lt;/td&gt;
&lt;td&gt;Xῶντο&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IM-6&lt;/th&gt;
&lt;th&gt;IM-7&lt;/th&gt;
&lt;th&gt;IM-8&lt;/th&gt;
&lt;th&gt;IM-9&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xύμην&lt;/td&gt;
&lt;td&gt;Xέμην&lt;/td&gt;
&lt;td&gt;Xόμην&lt;/td&gt;
&lt;td&gt;Xάμην&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xυσο&lt;/td&gt;
&lt;td&gt;Xεσο&lt;/td&gt;
&lt;td&gt;Xοσο&lt;/td&gt;
&lt;td&gt;Xασο/Xω&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xυτο&lt;/td&gt;
&lt;td&gt;Xετο&lt;/td&gt;
&lt;td&gt;Xοτο&lt;/td&gt;
&lt;td&gt;Xατο&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xύμεθα&lt;/td&gt;
&lt;td&gt;Xέμεθα&lt;/td&gt;
&lt;td&gt;Xόμεθα&lt;/td&gt;
&lt;td&gt;Xάμεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xυσθε&lt;/td&gt;
&lt;td&gt;Xεσθε&lt;/td&gt;
&lt;td&gt;Xοσθε&lt;/td&gt;
&lt;td&gt;Xασθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xυντο&lt;/td&gt;
&lt;td&gt;Xεντο&lt;/td&gt;
&lt;td&gt;Xοντο&lt;/td&gt;
&lt;td&gt;Xαντο&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;This does not quite cover the imperfect middle indicative forms we find in the SBLGNT.&lt;/p&gt;
&lt;p&gt;Recall that in the previous post we said the copula does not appear at all in the imperfect active &lt;strong&gt;1SG&lt;/strong&gt; in the SBLGNT. It does, however, appear in the middle form ἤμην 15 times.&lt;/p&gt;
&lt;p&gt;Furthermore, even though the &lt;strong&gt;1PL&lt;/strong&gt; active form ἦμεν does appear 8 times, we also find the middle form ἤμεθα 5 times. The SBLGNT text even has both in Galatians 4.3.&lt;/p&gt;
&lt;p&gt;We&#39;ll create an &lt;strong&gt;IM-10&lt;/strong&gt; class for these.&lt;/p&gt;
&lt;p&gt;We also have imperfect forms of κεῖμαι, so much like we created &lt;strong&gt;PM-11&lt;/strong&gt;, we&#39;ll create an &lt;strong&gt;IM-11&lt;/strong&gt; for &lt;strong&gt;3SG&lt;/strong&gt; forms like ἔκειτο, ἀνέκειτο, ἐπέκειτο, and κατέκειτο and for &lt;strong&gt;3PL&lt;/strong&gt; forms like ἐπέκειντο and συνανέκειντο.&lt;/p&gt;
&lt;p&gt;Finally, we have 11 occurrences of ἐκάθητο, the imperfect middle &lt;strong&gt;3SG&lt;/strong&gt; form of κάθημαι, which is treated as if it were no longer a compound of the preverb κατά + ἧμαι. We&#39;ll create an &lt;strong&gt;IM-12&lt;/strong&gt; for this form.&lt;/p&gt;
&lt;p&gt;Hence, we have:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IM-10&lt;/th&gt;
&lt;th&gt;IM-11&lt;/th&gt;
&lt;th&gt;IM-12&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἤμην&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;Xειτο&lt;/td&gt;
&lt;td&gt;Xητο&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἤμεθα&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;Xειντο&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The remaining cells would be straightforward to fill out, but as we&#39;re testing everything against a specific corpus and grammars, we&#39;ll have to extend the tests before we include other forms.&lt;/p&gt;
&lt;p&gt;In the next two posts, we&#39;ll finish up the middles with some disambiguation rules and corpus counts.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part thirty-three of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 32</title>
    <link href="https://jktauber.com/2019/12/27/a-tour-of-greek-morphology-part-32/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 32"/>
    <published>2019-12-27T16:00:00+08:00</published>
    <updated>2019-12-27T16:00:00+08:00</updated>
    <id>https://jktauber.com/2019/12/27/a-tour-of-greek-morphology-part-32</id>
    <content type="html" xml:base="https://jktauber.com/2019/12/27/a-tour-of-greek-morphology-part-32/">&lt;p&gt;Part thirty-two of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;We&#39;ll now use the disambiguation we did in the previous post to produce SBLGNT counts for the imperfect actives like we have for the presents before. I&#39;ve included all the lemmas if the list is short enough (and marked the hapakes with an asterisk).&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;class&lt;/th&gt;
&lt;th&gt;# lemmas&lt;/th&gt;
&lt;th&gt;# tokens&lt;/th&gt;
&lt;th&gt;# hapakes&lt;/th&gt;
&lt;th&gt;lemmas (* = hapax)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IA-1&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;150&lt;/td&gt;
&lt;td&gt;540&lt;/td&gt;
&lt;td&gt;87&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IA-2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;68&lt;/td&gt;
&lt;td&gt;239&lt;/td&gt;
&lt;td&gt;35&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IA-3&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;δολιόω* πληρόω* ἀξιόω* δηλόω* (and thematic forms of δίδωμι ἀποδίδωμι* παραδίδωμι*)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IA-4&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;60&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IA-5&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;ζάω&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IA-6a&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IA-7&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;τίθημι προστίθημι* ἐπιτίθημι*&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IA-8&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;δίδωμι ἐπιδίδωμι* παραδίδωμι (and -σαν form of ἔχω)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IA-9&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;43&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;φημί&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IA-10&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;435&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;εἰμί&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IA-10-COMP&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;σύνειμι* πάρειμι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IA-11-COMP&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;ἄπειμι* ἔξειμι* εἴσειμι&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;And the counts for each paradigm cell for each class:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IA-1&lt;/th&gt;
&lt;th&gt;IA-2&lt;/th&gt;
&lt;th&gt;IA-3&lt;/th&gt;
&lt;th&gt;IA-4&lt;/th&gt;
&lt;th&gt;IA-5&lt;/th&gt;
&lt;th&gt;IA-6a&lt;/th&gt;
&lt;th&gt;IA-7&lt;/th&gt;
&lt;th&gt;IA-8&lt;/th&gt;
&lt;th&gt;IA-9&lt;/th&gt;
&lt;th&gt;IA-10&lt;/th&gt;
&lt;th&gt;IA-10-COMP&lt;/th&gt;
&lt;th&gt;IA-11-COMP&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;18&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;255&lt;/td&gt;
&lt;td&gt;127&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;29&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;43&lt;/td&gt;
&lt;td&gt;314&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;244&lt;/td&gt;
&lt;td&gt;97&lt;/td&gt;
&lt;td&gt;4+1&lt;/td&gt;
&lt;td&gt;29&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;95&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;540&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;239&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;8&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;60&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;-&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;4&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;16&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;43&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;435&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;3&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;4&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;One of the things that&#39;s obvious about these numbers is the importance of the &lt;strong&gt;3SG&lt;/strong&gt; and &lt;strong&gt;3PL&lt;/strong&gt; forms. In every class other than &lt;strong&gt;IA-5&lt;/strong&gt; (ζάω) those two person-numbers dominate (and there are only two instances of ζάω anyway). Of the 11 inflection classes with forms in the SBLGNT, 7 of them ONLY have forms in either &lt;strong&gt;3SG&lt;/strong&gt; or &lt;strong&gt;3PL&lt;/strong&gt; or both. Notice that &lt;strong&gt;IA-9&lt;/strong&gt; appears only in the &lt;strong&gt;3SG&lt;/strong&gt; (i.e. ἔφη). Notice also that, despite showing up in the &lt;strong&gt;2SG&lt;/strong&gt;, &lt;strong&gt;1PL&lt;/strong&gt;, and &lt;strong&gt;2PL&lt;/strong&gt;, the copula does not appear at all in the imperfect active &lt;strong&gt;1SG&lt;/strong&gt; in the SBLGNT.&lt;/p&gt;
&lt;p&gt;Just as a little experiment, what if we showed the imperfect active indicative paradigms with only what is found in the SBLGNT text and showed complete forms (not just the distinguisher pattern) in any case where a form made up 25% or more of the instances of that cell? The result would be the following:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IA-1&lt;/th&gt;
&lt;th&gt;IA-2&lt;/th&gt;
&lt;th&gt;IA-3&lt;/th&gt;
&lt;th&gt;IA-4&lt;/th&gt;
&lt;th&gt;IA-5&lt;/th&gt;
&lt;th&gt;IA-6a&lt;/th&gt;
&lt;th&gt;IA-7&lt;/th&gt;
&lt;th&gt;IA-8&lt;/th&gt;
&lt;th&gt;IA-9&lt;/th&gt;
&lt;th&gt;IA-10&lt;/th&gt;
&lt;th&gt;IA-10-COMP&lt;/th&gt;
&lt;th&gt;IA-11-COMP&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xον&lt;/td&gt;
&lt;td&gt;Xουν&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;ἔζων&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἶχες, ἐζώννυες, ἤθελες&lt;/td&gt;
&lt;td&gt;περιεπάτεις&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;ἦς, ἦσθα&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;ἔλεγε(ν)&lt;/strong&gt; + other Xε(ν)&lt;/td&gt;
&lt;td&gt;Xει&lt;/td&gt;
&lt;td&gt;ἐπλήρου, ἠξίου, ἐδήλου&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;ἐπηρώτα&lt;/strong&gt; + other Xᾱ&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;προσετίθει, ἐτίθει&lt;/td&gt;
&lt;td&gt;ἐδίδου + other Xου&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;ἔφη&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;ἦν&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;εἰσῄει&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xομεν&lt;/td&gt;
&lt;td&gt;ἐζητοῦμεν, ἐλαλοῦμεν, παρεκαλοῦμεν, εὐδοκοῦμεν&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;ἦμεν&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;εἴχετε, ἐπιστεύετε + other Xετε&lt;/td&gt;
&lt;td&gt;ἐζητεῖτε, ἐποιεῖτε, ἐφρονεῖτε&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;ἠγαπᾶτε&lt;/td&gt;
&lt;td&gt;ἐζῆτε&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;ἦτε&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;ἔλεγον&lt;/strong&gt; + other Xον&lt;/td&gt;
&lt;td&gt;Xουν&lt;/td&gt;
&lt;td&gt;ἐδίδουν, ἀπεδίδουν, παρεδίδουν + ἐδολιοῦσαν&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;ἐπηρώτων&lt;/strong&gt; + other Xων&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;ἐτίθεσαν, ἐπετίθεσαν&lt;/td&gt;
&lt;td&gt;εἴχοσαν, ἐδίδοσαν, παρεδίδοσαν&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;ἦσαν&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;παρῆσαν, συνῆσαν&lt;/td&gt;
&lt;td&gt;ἀπῄεσαν, ἐξῄεσαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Here any form making up 25% or more of the tokens for that combination of inflectional class and person-number is shown (if that&#39;s all that&#39;s shown in a cell, there are no other forms). Forms in bold also occur 10 times or more in the text.&lt;/p&gt;
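&lt;p&gt;To make the selection rule concrete, here is a minimal Python sketch of it. The function name, its parameters and the return shape are assumptions for illustration only. The first example uses the &lt;strong&gt;IA-2&lt;/strong&gt; &lt;strong&gt;1PL&lt;/strong&gt; cell from the tables above (four tokens, four forms, so one token each); the second uses placeholder form names with invented counts, purely to show the &#34;+ other&#34; case.&lt;/p&gt;

```python
def summarise_cell(form_counts, share=0.25, bold_at=10):
    """Pick the forms to show in full for one paradigm cell.

    form_counts maps each attested form to its token count in the cell.
    Returns (forms shown in full, forms to bold, whether others remain).
    """
    total = sum(form_counts.values())
    if total == 0:
        return [], [], False
    # Show a form in full if it makes up at least `share` of the cell.
    shown = [f for f, n in form_counts.items() if n / total >= share]
    # Bold a shown form if it also occurs at least `bold_at` times.
    bold = [f for f in shown if form_counts[f] >= bold_at]
    # Any form left out is covered by "+ other" and the pattern.
    has_other = len(shown) != len(form_counts)
    return shown, bold, has_other


# IA-2 1PL: four forms, one token each, so every form reaches 25%.
ia2_1pl = {"ἐζητοῦμεν": 1, "ἐλαλοῦμεν": 1, "παρεκαλοῦμεν": 1, "εὐδοκοῦμεν": 1}
print(summarise_cell(ia2_1pl))

# Hypothetical cell: placeholder names and invented counts.
example = {"form-a": 11, "form-b": 4, "form-c": 3}
print(summarise_cell(example))
```

&lt;p&gt;When no form reaches the threshold, the first list comes back empty and only the distinguisher pattern (such as Xον) would be shown for that cell.&lt;/p&gt;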
&lt;p&gt;That wraps up our exploration of the indicative imperfect active endings. In the next three posts, we&#39;ll finish up the middles.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part thirty-two of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 31</title>
    <link href="https://jktauber.com/2019/12/26/a-tour-of-greek-morphology-part-31/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 31"/>
    <published>2019-12-26T16:00:00+08:00</published>
    <updated>2019-12-26T16:00:00+08:00</updated>
    <id>https://jktauber.com/2019/12/26/a-tour-of-greek-morphology-part-31</id>
    <content type="html" xml:base="https://jktauber.com/2019/12/26/a-tour-of-greek-morphology-part-31/">&lt;p&gt;Part thirty-one of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;In the &lt;a href=&#34;{% post_url 2019-12-12-a-tour-of-greek-morphology-part-30 %}&#34;&gt;previous post&lt;/a&gt; we went through and made sure we had all our imperfect active indicative endings covered, ready for counting. We still had some ambiguities, though, so we need to use rules based around the lemma to disambiguate. We can then apply those rules to generate our data for counting.&lt;/p&gt;
&lt;p&gt;Our disambiguation rules are:&lt;/p&gt;
&lt;table&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;2SG&lt;/b&gt;:-ης or
      &lt;b&gt;3SG&lt;/b&gt;:-η
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;IA-5&lt;/b&gt; if lemma ends in -ω&lt;br&gt;
      &lt;b&gt;IA-9&lt;/b&gt; if lemma is ἵστημι&lt;br&gt;
      &lt;b&gt;IA-9b&lt;/b&gt; if lemma is φημί
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;1PL&lt;/b&gt;:-αμεν or
      &lt;b&gt;2PL&lt;/b&gt;:-ατε or
      &lt;b&gt;3PL&lt;/b&gt;:-ασαν
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;IA-9&lt;/b&gt; if lemma is ἵστημι&lt;br&gt;
      &lt;b&gt;IA-9b&lt;/b&gt; if lemma is φημί
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;2PL&lt;/b&gt;:-ετε
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;IA-1&lt;/b&gt; if lemma ends in -ω&lt;br&gt;
      &lt;b&gt;IA-7&lt;/b&gt; if lemma ends in -μι
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;2SG&lt;/b&gt;:-εις or
      &lt;b&gt;3SG&lt;/b&gt;:-ει
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;IA-2&lt;/b&gt; if lemma ends in -ω&lt;br&gt;
      &lt;b&gt;IA-7&lt;/b&gt; if lemma ends in -μι
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
    &lt;b&gt;1PL&lt;/b&gt;:-οῦμεν or
    &lt;b&gt;3PL&lt;/b&gt;:-ουν
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;IA-2&lt;/b&gt; if lemma ends in -έω&lt;br&gt;
      &lt;b&gt;IA-3&lt;/b&gt; if lemma ends in -όω
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;1SG&lt;/b&gt;:-ων or
      &lt;b&gt;1PL&lt;/b&gt;:-ῶμεν or
      &lt;b&gt;3PL&lt;/b&gt;:-ων
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;IA-5&lt;/b&gt; if lemma is ζάω (should really just lemmatise ζήω)&lt;br&gt;
      &lt;b&gt;IA-4&lt;/b&gt; if lemma is other -άω
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;2SG&lt;/b&gt;:-υς or
      &lt;b&gt;3SG&lt;/b&gt;:-υ
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;IA-3&lt;/b&gt; if lemma ends in -ω&lt;br&gt;
      &lt;b&gt;IA-6a&lt;/b&gt; if lemma ends in -υμι (or if form not -ους / -ου)&lt;br&gt;
      &lt;b&gt;IA-8&lt;/b&gt; if lemma ends in -ωμι
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;1SG&lt;/b&gt;:-υν
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;IA-2&lt;/b&gt; if lemma ends in -έω&lt;br&gt;
      &lt;b&gt;IA-3&lt;/b&gt; if lemma ends in -όω&lt;br&gt;
      &lt;b&gt;IA-6a&lt;/b&gt; if lemma ends in -υμι (or if form not -ουν)&lt;br&gt;
      &lt;b&gt;IA-8&lt;/b&gt; if lemma ends in -ωμι
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;1PL&lt;/b&gt;:-ομεν
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;IA-1&lt;/b&gt; if lemma ends in -ω&lt;br&gt;
      &lt;b&gt;IA-8&lt;/b&gt; if lemma ends in -μι
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;1SG&lt;/b&gt;:-ειν or
      &lt;b&gt;3PL&lt;/b&gt;:-εσαν
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;IA-7&lt;/b&gt; if lemma is τίθημι or ἵημι (not an issue in SBLGNT)&lt;br&gt;
      &lt;b&gt;IA-11&lt;/b&gt; if lemma is εἶμι&lt;br&gt;
      &lt;b&gt;IA-11-COMP&lt;/b&gt; if lemma ends in -ειμι
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;2SG&lt;/b&gt;:-εις
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;IA-2&lt;/b&gt; if lemma ends in -ω (not an issue in SBLGNT)&lt;br&gt;
      &lt;b&gt;IA-7&lt;/b&gt; if lemma is τίθημι or ἵημι (not an issue in SBLGNT)&lt;br&gt;
      &lt;b&gt;IA-11&lt;/b&gt; if lemma is εἶμι&lt;br&gt;
      &lt;b&gt;IA-11-COMP&lt;/b&gt; if lemma ends in -ειμι
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;1SG&lt;/b&gt;:-ην
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;IA-7&lt;/b&gt; if lemma is τίθημι or ἵημι&lt;br&gt;
      &lt;b&gt;IA-9&lt;/b&gt; if lemma is ἵστημι&lt;br&gt;
      &lt;b&gt;IA-9b&lt;/b&gt; if lemma is φημί
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;2PL&lt;/b&gt;:-ῆτε
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;IA-5&lt;/b&gt; if lemma ends in -ω&lt;br&gt;
      &lt;b&gt;IA-10-COMP&lt;/b&gt; if lemma ends in -μι (not an issue in SBLGNT)
    &lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;
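&lt;p&gt;To give a feel for how lemma-based rules like these might be written down in code, here is a minimal Python sketch covering only the first two rows of the table. The names &lt;code&gt;RULES&lt;/code&gt;, &lt;code&gt;nfc&lt;/code&gt; and &lt;code&gt;disambiguate&lt;/code&gt; are hypothetical, and this is an illustration under those assumptions rather than the actual script:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;import unicodedata

# Hypothetical encoding of the first two rows of the table above:
# (ambiguous cell/ending pairs, ordered lemma tests with the class each selects).
RULES = [
    (
        {('2SG', 'ης'), ('3SG', 'η')},
        [('is', 'ἵστημι', 'IA-9'), ('is', 'φημί', 'IA-9b'), ('ends', 'ω', 'IA-5')],
    ),
    (
        {('1PL', 'αμεν'), ('2PL', 'ατε'), ('3PL', 'ασαν')},
        [('is', 'ἵστημι', 'IA-9'), ('is', 'φημί', 'IA-9b')],
    ),
]


def nfc(text):
    """Normalise to NFC so visually identical Greek strings compare equal."""
    return unicodedata.normalize('NFC', text)


def disambiguate(cell, ending, lemma):
    """Return the class picked by the first matching lemma test, else None."""
    lemma = nfc(lemma)
    for pairs, tests in RULES:
        if (cell, nfc(ending)) not in {(c, nfc(e)) for c, e in pairs}:
            continue
        for kind, value, inflectional_class in tests:
            value = nfc(value)
            if kind == 'is' and lemma == value:
                return inflectional_class
            if kind == 'ends' and lemma.endswith(value):
                return inflectional_class
    return None


print(disambiguate('3SG', 'η', 'φημί'))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The specific lemmas are tested before the generic -ω test so that the order of the tests, not the order of the table rows, decides the outcome. The remaining rows could be added to &lt;code&gt;RULES&lt;/code&gt; in the same shape.&lt;/p&gt;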

&lt;p&gt;Encapsulating these rules into a Python script and running on our data, we now have an inflectional class for all 1,344 imperfect active indicative forms in the MorphGNT SBLGNT.&lt;/p&gt;
&lt;p&gt;The output of my Python script looks like this:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;010118  ἦν  3SG IA-10   εἰμί    IA-10   ἦν  _   ἦν
010125  ἐγίνωσκε(ν) 3SG IA-1    γινώσκω IA-1    Xε(ν)   ἐγίνωσκ ε(ν)
010209  προῆγε(ν)   3SG IA-1    προάγω  IA-1    Xε(ν)   προῆγ   ε(ν)
010209  ἦν  3SG IA-10   εἰμί    IA-10   ἦν  _   ἦν
010215  ἦν  3SG IA-10   εἰμί    IA-10   ἦν  _   ἦν
010218  ἤθελε(ν)    3SG IA-1    θέλω    IA-1    Xε(ν)   ἤθελ    ε(ν)
010304  εἶχε(ν) 3SG IA-1    ἔχω IA-1    Xε(ν)   εἶχ ε(ν)
010304  ἦν  3SG IA-10   εἰμί    IA-10   ἦν  _   ἦν
010314  διεκώλυε(ν) 3SG IA-1    διακωλύω    IA-1    Xε(ν)   διεκώλυ ε(ν)
010407  ἔφη 3SG IA-5/IA-9/IA-9b φημί    IA-9b   Xη  ἔφ  η
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The columns are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the book/chapter/verse reference&lt;/li&gt;
&lt;li&gt;the normalized form&lt;/li&gt;
&lt;li&gt;the morphosyntactic properties&lt;/li&gt;
&lt;li&gt;the inflectional classes possible without disambiguation&lt;/li&gt;
&lt;li&gt;the lemma&lt;/li&gt;
&lt;li&gt;the disambiguated inflectional class&lt;/li&gt;
&lt;li&gt;the distinguisher pattern&lt;/li&gt;
&lt;li&gt;the theme (the value of X)&lt;/li&gt;
&lt;li&gt;the distinguisher&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can download the entire thing &lt;a href=&#34;https://gist.github.com/jtauber/7fd4747002684a04f12f928ae769ec40&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We&#39;ll use this to do our counts in the next post.&lt;/p&gt;
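&lt;p&gt;For anyone wanting to work with the output directly, the following is a small sketch of reading one line into named fields. It assumes the columns are separated by whitespace and that no column contains whitespace itself; the field names are my own labels for the nine columns listed above, not names taken from the script:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;from collections import namedtuple

# Field names are illustrative labels for the nine columns listed above.
Row = namedtuple('Row', [
    'ref', 'form', 'properties', 'possible_classes', 'lemma',
    'inflectional_class', 'pattern', 'theme', 'distinguisher',
])


def parse_line(line):
    """Split one whitespace-separated output line into a Row."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError('expected 9 columns')
    # The undisambiguated classes are slash-separated, e.g. IA-5/IA-9/IA-9b.
    fields[3] = fields[3].split('/')
    return Row(*fields)


row = parse_line('010407  ἔφη 3SG IA-5/IA-9/IA-9b φημί    IA-9b   Xη  ἔφ  η')
print(row.lemma, row.possible_classes, row.inflectional_class)
&lt;/code&gt;&lt;/pre&gt;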
&lt;p&gt;But before that, there are a couple of things we can check.&lt;/p&gt;
&lt;p&gt;Firstly, &lt;strong&gt;are the disambiguated inflectional classes consistent for each lexeme?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;There are five exceptions, all of which we raised in the previous post:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ἔχω is variously &lt;strong&gt;IA-1&lt;/strong&gt; or &lt;strong&gt;IA-8&lt;/strong&gt; (the alternate εἴχοσαν for the &lt;strong&gt;3PL&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;ἐρωτάω is variously &lt;strong&gt;IA-4&lt;/strong&gt; or &lt;strong&gt;IA-2&lt;/strong&gt; (the alternate ἠρώτουν for the &lt;strong&gt;3PL&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;δίδωμι is variously &lt;strong&gt;IA-8&lt;/strong&gt; or &lt;strong&gt;IA-3&lt;/strong&gt; (the alternate ἐδίδουν for the &lt;strong&gt;3PL&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;παραδίδωμι is variously &lt;strong&gt;IA-8&lt;/strong&gt; or &lt;strong&gt;IA-3&lt;/strong&gt; (the alternative παρεδίδουν for the &lt;strong&gt;3PL&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;τίθημι is variously &lt;strong&gt;IA-7&lt;/strong&gt; or &lt;strong&gt;IA-2&lt;/strong&gt; (the alternative ἐτίθουν for the &lt;strong&gt;3PL&lt;/strong&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Notice they are all in the &lt;strong&gt;3PL&lt;/strong&gt; and, with the exception of the ἐρωτάω case, are alternations between thematic and athematic endings.&lt;/p&gt;
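&lt;p&gt;A consistency check of this kind can be sketched in a few lines of Python. The helper name &lt;code&gt;inconsistent&lt;/code&gt; and the toy rows are assumptions for illustration only; the real check would run over the full data set:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;from collections import defaultdict


def inconsistent(rows, key, value):
    """Map each key (e.g. a lemma) to its set of values when there is more than one."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[key]].add(row[value])
    # Every set has at least one member, so != 1 means two or more.
    return {k: v for k, v in seen.items() if len(v) != 1}


# Toy rows for illustration only, not the real data set.
rows = [
    {'lemma': 'ἔχω', 'form': 'εἶχον', 'class': 'IA-1'},
    {'lemma': 'ἔχω', 'form': 'εἴχοσαν', 'class': 'IA-8'},
    {'lemma': 'λέγω', 'form': 'ἔλεγον', 'class': 'IA-1'},
]
print(inconsistent(rows, 'lemma', 'class'))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The same helper would work for any other pair of columns, such as lemma and theme.&lt;/p&gt;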
&lt;p&gt;Secondly, &lt;strong&gt;is the value of X in our paradigm patterns consistent across a lexeme?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;There seem to be four exceptions, three of which are to do with the augment:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;εὑρίσκω: ηὑρισκ or εὑρισκ&lt;/li&gt;
&lt;li&gt;μέλλω: ἠμελλ or ἐμελλ&lt;/li&gt;
&lt;li&gt;εὐκαιρέω: εὐκαιρ or ηὐκαιρ&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So far we&#39;ve glossed over the augment but we shall look at it in detail in a future post.&lt;/p&gt;
&lt;p&gt;There is also:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;συγχέω: συνεχε or συνεχυνν&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We previously brought this one up. It is not just an inflectional class difference but also a stem formation difference. We&#39;ll talk a bit more about this in future posts, but for now it&#39;s probably best thought of as two distinct lemmas that are conventionally conflated under the single headword συγχέω. Notice also that συνέχεον is the one example of an uncontracted &lt;strong&gt;IA-2&lt;/strong&gt; in the SBLGNT.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part thirty-one of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 30</title>
    <link href="https://jktauber.com/2019/12/12/a-tour-of-greek-morphology-part-30/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 30"/>
    <published>2019-12-12T02:26:11-05:00</published>
    <updated>2019-12-12T02:26:11-05:00</updated>
    <id>https://jktauber.com/2019/12/12/a-tour-of-greek-morphology-part-30</id>
    <content type="html" xml:base="https://jktauber.com/2019/12/12/a-tour-of-greek-morphology-part-30/">&lt;p&gt;Part thirty of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;To complete the imperfect active indicatives, there are just a few more tweaks we need to make.&lt;/p&gt;
&lt;p&gt;Firstly, we need to add the compound versions of &lt;strong&gt;IA-10&lt;/strong&gt; and &lt;strong&gt;IA-11&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Secondly, because the form ἦστε (for the &lt;strong&gt;2PL&lt;/strong&gt; of ἦν) appears in one of our test grammars, we need to add that to &lt;strong&gt;IA-10&lt;/strong&gt;. More on that whole paradigm later.&lt;/p&gt;
&lt;p&gt;Thirdly, let&#39;s just rename &lt;strong&gt;IA-6&lt;/strong&gt; to &lt;strong&gt;IA-6a&lt;/strong&gt; for consistency with how we named the present once we decided to include the υ.&lt;/p&gt;
&lt;p&gt;That results in this update to &lt;strong&gt;IA-6a&lt;/strong&gt; through &lt;strong&gt;IA-11-COMP&lt;/strong&gt;.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IA-6a&lt;/th&gt;
&lt;th&gt;IA-7&lt;/th&gt;
&lt;th&gt;IA-8&lt;/th&gt;
&lt;th&gt;IA-9&lt;/th&gt;
&lt;th&gt;IA-9b&lt;/th&gt;
&lt;th&gt;IA-10&lt;/th&gt;
&lt;th&gt;IA-10-COMP&lt;/th&gt;
&lt;th&gt;IA-11&lt;/th&gt;
&lt;th&gt;IA-11-COMP&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xῡν&lt;/td&gt;
&lt;td&gt;Xην/Xειν&lt;/td&gt;
&lt;td&gt;Xουν&lt;/td&gt;
&lt;td&gt;Xην&lt;/td&gt;
&lt;td&gt;Xην&lt;/td&gt;
&lt;td&gt;ἦ/ἦν&lt;/td&gt;
&lt;td&gt;Xῆ/Xῆν&lt;/td&gt;
&lt;td&gt;ᾖα/ᾔειν&lt;/td&gt;
&lt;td&gt;Xῇα/Xῄειν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xῡς&lt;/td&gt;
&lt;td&gt;Xεις&lt;/td&gt;
&lt;td&gt;Xους&lt;/td&gt;
&lt;td&gt;Xης&lt;/td&gt;
&lt;td&gt;Xης/Xησθα&lt;/td&gt;
&lt;td&gt;ἦς/ἦσθα&lt;/td&gt;
&lt;td&gt;Xῆς/Xῆσθα&lt;/td&gt;
&lt;td&gt;ᾔεις/ᾔεισθα&lt;/td&gt;
&lt;td&gt;Xῄεις/Xῄεισθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xῡ&lt;/td&gt;
&lt;td&gt;Xει&lt;/td&gt;
&lt;td&gt;Xου&lt;/td&gt;
&lt;td&gt;Xη&lt;/td&gt;
&lt;td&gt;Xη&lt;/td&gt;
&lt;td&gt;ἦν&lt;/td&gt;
&lt;td&gt;Xῆν&lt;/td&gt;
&lt;td&gt;ᾔει/ᾔειν&lt;/td&gt;
&lt;td&gt;Xῄει/Xῄειν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xῠμεν&lt;/td&gt;
&lt;td&gt;Xεμεν&lt;/td&gt;
&lt;td&gt;Xομεν&lt;/td&gt;
&lt;td&gt;Xᾰμεν&lt;/td&gt;
&lt;td&gt;Xᾰμεν&lt;/td&gt;
&lt;td&gt;ἦμεν&lt;/td&gt;
&lt;td&gt;Xῆμεν&lt;/td&gt;
&lt;td&gt;ᾖμεν&lt;/td&gt;
&lt;td&gt;Xῇμεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xῠτε&lt;/td&gt;
&lt;td&gt;Xετε&lt;/td&gt;
&lt;td&gt;Xοτε&lt;/td&gt;
&lt;td&gt;Xᾰτε&lt;/td&gt;
&lt;td&gt;Xᾰτε&lt;/td&gt;
&lt;td&gt;ἦτε/ἦστε&lt;/td&gt;
&lt;td&gt;Xῆτε/Xῆστε&lt;/td&gt;
&lt;td&gt;ᾖτε&lt;/td&gt;
&lt;td&gt;Xῇτε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xῠσᾰν&lt;/td&gt;
&lt;td&gt;Xεσᾰν&lt;/td&gt;
&lt;td&gt;Xοσᾰν&lt;/td&gt;
&lt;td&gt;Xᾰσᾰν&lt;/td&gt;
&lt;td&gt;Xᾰσᾰν&lt;/td&gt;
&lt;td&gt;ἦσᾰν&lt;/td&gt;
&lt;td&gt;Xῆσᾰν&lt;/td&gt;
&lt;td&gt;ᾖσᾰν/ᾔεσᾰν&lt;/td&gt;
&lt;td&gt;Xῇσᾰν/Xῄεσᾰν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;And just for completeness, here&#39;s the rest:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IA-1&lt;/th&gt;
&lt;th&gt;IA-2&lt;/th&gt;
&lt;th&gt;IA-3&lt;/th&gt;
&lt;th&gt;IA-4&lt;/th&gt;
&lt;th&gt;IA-5&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xον&lt;/td&gt;
&lt;td&gt;Xουν&lt;/td&gt;
&lt;td&gt;Xουν&lt;/td&gt;
&lt;td&gt;Xων&lt;/td&gt;
&lt;td&gt;Xων&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xες&lt;/td&gt;
&lt;td&gt;Xεις&lt;/td&gt;
&lt;td&gt;Xους&lt;/td&gt;
&lt;td&gt;Xᾱς&lt;/td&gt;
&lt;td&gt;Xης&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xε(ν)&lt;/td&gt;
&lt;td&gt;Xει&lt;/td&gt;
&lt;td&gt;Xου&lt;/td&gt;
&lt;td&gt;Xᾱ&lt;/td&gt;
&lt;td&gt;Xη&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xομεν&lt;/td&gt;
&lt;td&gt;Xοῦμεν&lt;/td&gt;
&lt;td&gt;Xοῦμεν&lt;/td&gt;
&lt;td&gt;Xῶμεν&lt;/td&gt;
&lt;td&gt;Xῶμεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xετε&lt;/td&gt;
&lt;td&gt;Xεῖτε&lt;/td&gt;
&lt;td&gt;Xοῦτε&lt;/td&gt;
&lt;td&gt;Xᾶτε&lt;/td&gt;
&lt;td&gt;Xῆτε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xον&lt;/td&gt;
&lt;td&gt;Xουν&lt;/td&gt;
&lt;td&gt;Xουν&lt;/td&gt;
&lt;td&gt;Xων&lt;/td&gt;
&lt;td&gt;Xων&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Does this handle all the forms in the MorphGNT plus our test grammars?&lt;/p&gt;
&lt;p&gt;Almost.&lt;/p&gt;
&lt;p&gt;In Romans 3.13, we find ἐδολιοῦσαν, which does not match any of our patterns. What is happening here?&lt;/p&gt;
&lt;p&gt;We have a contraction (suggesting &lt;strong&gt;IA-2&lt;/strong&gt; or &lt;strong&gt;IA-3&lt;/strong&gt; but, as indicated in the lemma, it&#39;s an &lt;strong&gt;IA-3&lt;/strong&gt;) but with a -σᾰν ending like we would expect in an athematic verb. Because the contraction would normally only happen with a theme-vowel, we don&#39;t expect to see both -οῦ- and -σαν together.&lt;/p&gt;
&lt;p&gt;If you look at &lt;strong&gt;IA-3&lt;/strong&gt; and &lt;strong&gt;IA-8&lt;/strong&gt; you can see they are indistinguishable in the singular. In fact &lt;strong&gt;IA-8&lt;/strong&gt; is acting like a thematic verb in the singular so there was already a merger happening between the classes. Further confusion about which endings to use in the plural makes sense, although here we have an interesting combination of distinguishers, combining the -οῦ- we might expect in an &lt;strong&gt;IA-3&lt;/strong&gt; plural with the -σᾰν we expect in an athematic plural.&lt;/p&gt;
&lt;p&gt;It&#39;s worth pointing out that this particular phenomenon is fairly common in the Septuagint and Romans 3.13 is a quote from the Septuagint. We can&#39;t know for sure if Paul would normally have written ἐδολίουν instead but we can speculate that it&#39;s like an American writer keeping British spelling in a quotation of a British author.&lt;/p&gt;
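&lt;p&gt;The failure of ἐδολιοῦσαν to match can be illustrated with a rough Python sketch that tests a form against the &lt;strong&gt;3PL&lt;/strong&gt; distinguisher patterns by suffix. This is only a sketch under some assumptions: the names &lt;code&gt;PATTERNS_3PL&lt;/code&gt;, &lt;code&gt;plain&lt;/code&gt; and &lt;code&gt;matching_classes&lt;/code&gt; are hypothetical, vowel-length marks are stripped before comparing, the comparison is accent-sensitive, and &lt;strong&gt;IA-10&lt;/strong&gt; and &lt;strong&gt;IA-11&lt;/strong&gt; are left out:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;import unicodedata

# 3PL distinguisher patterns copied from the tables above (X is the theme).
PATTERNS_3PL = {
    'IA-1': 'Xον', 'IA-2': 'Xουν', 'IA-3': 'Xουν', 'IA-4': 'Xων',
    'IA-5': 'Xων', 'IA-6a': 'Xῠσᾰν', 'IA-7': 'Xεσᾰν', 'IA-8': 'Xοσᾰν',
    'IA-9': 'Xᾰσᾰν', 'IA-9b': 'Xᾰσᾰν',
}


def plain(text):
    """Drop vowel-length marks (breve, macron) so patterns compare with text."""
    decomposed = unicodedata.normalize('NFD', text)
    stripped = decomposed.replace('\u0306', '').replace('\u0304', '')
    return unicodedata.normalize('NFC', stripped)


def matching_classes(form):
    """Return the classes whose 3PL ending (the part after X) ends the form."""
    form = plain(form)
    return sorted(
        name for name, pattern in PATTERNS_3PL.items()
        if form.endswith(plain(pattern[1:]))
    )


for form in ['ἐδίδουν', 'ἐδίδοσαν', 'ἐδολιοῦσαν']:
    print(form, matching_classes(form))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With these patterns, ἐδίδουν is compatible with &lt;strong&gt;IA-2&lt;/strong&gt; and &lt;strong&gt;IA-3&lt;/strong&gt;, ἐδίδοσαν with &lt;strong&gt;IA-8&lt;/strong&gt;, and ἐδολιοῦσαν with none of them, because -οῦσαν is neither -ουν nor -οσαν.&lt;/p&gt;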
&lt;p&gt;In our data, that&#39;s the only form that fails to match. But there are others that exhibit a similar phenomenon that we should collect for completeness.&lt;/p&gt;
&lt;p&gt;Twice in John 15 we find εἴχοσαν where we might expect εἶχον (and indeed do find it outside of John). This is again common in the LXX.&lt;/p&gt;
&lt;p&gt;More widespread (and not particularly characteristic of the LXX) is the replacement of athematic verbs with a thematic equivalent.&lt;/p&gt;
&lt;p&gt;Twice in Acts we find ἐτίθουν where we would expect ἐτίθεσαν (acting like an &lt;strong&gt;IA-2&lt;/strong&gt; τιθέω).&lt;/p&gt;
&lt;p&gt;Also in Acts we find ἀπεδίδουν (acting like an &lt;strong&gt;IA-3&lt;/strong&gt; διδόω) and both παρεδίδουν (Acts 27.1) and παρεδίδοσαν (Acts 16.4).&lt;/p&gt;
&lt;p&gt;These athematic verbs are inflecting as if they were thematic. Note this actually causes a &lt;strong&gt;1SG&lt;/strong&gt; / &lt;strong&gt;3PL&lt;/strong&gt; ambiguity that wouldn&#39;t otherwise exist.&lt;/p&gt;
&lt;p&gt;There are other examples of athematic verbs inflecting as if thematic:&lt;/p&gt;
&lt;p&gt;John 21.18 has ἐζώννυες with a theme vowel (becoming thematic &lt;strong&gt;IA-1&lt;/strong&gt; instead of &lt;strong&gt;IA-6a&lt;/strong&gt;).&lt;/p&gt;
&lt;p&gt;Matthew 21.8 has ἐστρώννυον with theme vowel (becoming thematic &lt;strong&gt;IA-1&lt;/strong&gt; instead of &lt;strong&gt;IA-6a&lt;/strong&gt;).&lt;/p&gt;
&lt;p&gt;Twice in Mark we find ἤφιε(ν) (becoming thematic &lt;strong&gt;IA-1&lt;/strong&gt; instead of &lt;strong&gt;IA-7&lt;/strong&gt; where we&#39;d expect ἠφίει).&lt;/p&gt;
&lt;p&gt;And, in different categories:&lt;/p&gt;
&lt;p&gt;Acts 21.27 has uncontracted &lt;strong&gt;3PL&lt;/strong&gt; συνέχεον.&lt;/p&gt;
&lt;p&gt;Acts 9.22 has &lt;strong&gt;3SG&lt;/strong&gt; συνέχυννεν as if the lemma were συγχύννω (and I&#39;m tempted to, in fact, lemmatise it that way).&lt;/p&gt;
&lt;p&gt;We also have cases of confusion between -αω and -εω verbs (something that had long happened in Greek dialects). ἠρώτουν in Matthew 15.23 looks like an &lt;strong&gt;IA-2&lt;/strong&gt; (or &lt;strong&gt;IA-3&lt;/strong&gt;) even though ἠρώτα and ἠρώτων elsewhere suggest &lt;strong&gt;IA-4&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Unlike the usual confusions between inflectional classes we&#39;ve seen above, though, there are no distinguisher patterns shared between &lt;strong&gt;IA-2&lt;/strong&gt; and &lt;strong&gt;IA-4&lt;/strong&gt; so the underlying cause is different.&lt;/p&gt;
&lt;p&gt;A few other points to raise to round out the full set of imperfect active forms in our data:&lt;/p&gt;
&lt;p&gt;In MorphGNT there is an &lt;a href=&#34;https://github.com/morphgnt/sblgnt/issues/30&#34;&gt;open issue&lt;/a&gt; about ἔστηκεν in John 8.44. MorphGNT currently analyses it as an imperfect (it would be the imperfect of στήκω) but with the lemma ἵστημι (which would have a &lt;em&gt;perfect&lt;/em&gt; of ἕστηκεν with rough breathing). This needs to be resolved in MorphGNT so it doesn&#39;t really affect our analysis of imperfect active forms here but I thought I&#39;d mention it.&lt;/p&gt;
&lt;p&gt;Another issue that should be considered in MorphGNT is that εἰσῄει in Acts 21 should possibly be normalised with the movable nu as it&#39;s an &lt;strong&gt;IA-11-COMP&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;In the next post, we&#39;ll go through resolving any remaining ambiguities in the imperfect active forms.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part thirty of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">Release of text-validator 0.3</title>
    <link href="https://jktauber.com/2019/12/03/release-of-textvalidator-03/" rel="alternate" type="text/html" title="Release of text-validator 0.3"/>
    <published>2019-12-03T23:28:37-05:00</published>
    <updated>2019-12-03T23:28:37-05:00</updated>
    <id>https://jktauber.com/2019/12/03/release-of-textvalidator-03</id>
    <content type="html" xml:base="https://jktauber.com/2019/12/03/release-of-textvalidator-03/">&lt;p&gt;A few weeks ago, I &lt;a href=&#34;{% post_url 2019-11-11-release-of-textvalidator-01 %}&#34;&gt;announced&lt;/a&gt; the first release of &lt;code&gt;text-validator&lt;/code&gt;, my pluggable command-line tool for validating the formatting and orthography of text files.&lt;/p&gt;
&lt;p&gt;Since then I&#39;ve done a couple of small updates.&lt;/p&gt;
&lt;p&gt;In 0.2, I added a validator plugin to test tokens against a list of regular expressions. This is great for catching stray characters.&lt;/p&gt;
&lt;p&gt;For example, here&#39;s the configuration I use for my text of the Enchiridion:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-toml&#34;&gt;TOKEN_REGEXES = [
    &amp;quot;\\d+\\.\\d+$&amp;quot;,
    &amp;quot;[«(]*[\u0370-\u03FF\u1F00-\u1FFF]+\u2019?[.,:;»)]*$&amp;quot;,
]
&lt;/code&gt;&lt;/pre&gt;
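&lt;p&gt;To illustrate what patterns like these accept, here is a minimal Python sketch using the standard &lt;code&gt;re&lt;/code&gt; module. It is not the implementation in &lt;code&gt;text-validator&lt;/code&gt;: I am assuming that tokens are whitespace-separated and that each token must match at least one pattern from its start, and the function name &lt;code&gt;invalid_tokens&lt;/code&gt; is hypothetical:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;import re

# The two patterns from the configuration above, compiled with Python's re.
TOKEN_REGEXES = [
    re.compile(r'\d+\.\d+$'),
    re.compile(r'[«(]*[\u0370-\u03FF\u1F00-\u1FFF]+\u2019?[.,:;»)]*$'),
]


def invalid_tokens(text):
    """Return whitespace-separated tokens that match none of the patterns."""
    return [
        token for token in text.split()
        if not any(regex.match(token) for regex in TOKEN_REGEXES)
    ]


print(invalid_tokens('1.1 Τῶν ὄντων τὰ μέν ἐστιν, foo'))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here the reference and the Greek words pass, while the stray Latin-script token is reported. Note that the character ranges only cover precomposed Greek, so text in a decomposed Unicode form would be flagged too.&lt;/p&gt;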

&lt;p&gt;In 0.3, I made a small but significant change: the tool now returns a non-zero status code if validation fails. This doesn&#39;t make much difference if you&#39;re just running the tool manually on the command-line but if you&#39;re running it as part of a continuous integration (CI) process, this is vital.&lt;/p&gt;
&lt;p&gt;With the 0.3 change, I was able to set up a GitHub Action on both the &lt;a href=&#34;https://github.com/jtauber/apostolic-fathers&#34;&gt;apostolic-fathers&lt;/a&gt; and &lt;a href=&#34;https://github.com/jtauber/enchiridion&#34;&gt;enchiridion&lt;/a&gt; repos to automatically run &lt;code&gt;text-validator&lt;/code&gt; any time there is a new push or pull request.&lt;/p&gt;
&lt;p&gt;You can read more about &lt;code&gt;text-validator&lt;/code&gt; and how to use it at https://github.com/jtauber/text-validator and the linked-to wiki pages.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">A few weeks ago, I &lt;a href=&#34;{% post_url 2019-11-11-release-of-textvalidator-01 %}&#34;&gt;announced&lt;/a&gt; the first release of &lt;code&gt;text-validator&lt;/code&gt;, my pluggable command-line tool for validating the formatting and orthography of text files.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 29</title>
    <link href="https://jktauber.com/2019/11/29/a-tour-of-greek-morphology-part-29/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 29"/>
    <published>2019-11-29T03:40:34-05:00</published>
    <updated>2019-11-29T03:40:34-05:00</updated>
    <id>https://jktauber.com/2019/11/29/a-tour-of-greek-morphology-part-29</id>
    <content type="html" xml:base="https://jktauber.com/2019/11/29/a-tour-of-greek-morphology-part-29/">&lt;p&gt;Part twenty-nine of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;In this post, we review the imperfect middle distinguishers in much the same way as we did the imperfect actives in &lt;a href=&#34;{% post_url 2019-04-30-tour-greek-morphology-part-28 %}&#34;&gt;Part 28&lt;/a&gt; and the present middles in &lt;a href=&#34;{% post_url 2017-08-29-tour-greek-morphology-part-14 %}&#34;&gt;Part 14&lt;/a&gt;.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IM-1&lt;/th&gt;
&lt;th&gt;IM-2&lt;/th&gt;
&lt;th&gt;IM-3&lt;/th&gt;
&lt;th&gt;IM-4&lt;/th&gt;
&lt;th&gt;IM-5&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xόμην&lt;/td&gt;
&lt;td&gt;Xούμην&lt;/td&gt;
&lt;td&gt;Xούμην&lt;/td&gt;
&lt;td&gt;Xώμην&lt;/td&gt;
&lt;td&gt;Xώμην&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xου&lt;/td&gt;
&lt;td&gt;Xοῦ&lt;/td&gt;
&lt;td&gt;Xοῦ&lt;/td&gt;
&lt;td&gt;Xῶ&lt;/td&gt;
&lt;td&gt;Xῶ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xετο&lt;/td&gt;
&lt;td&gt;Xεῖτο&lt;/td&gt;
&lt;td&gt;Xοῦτο&lt;/td&gt;
&lt;td&gt;Xᾶτο&lt;/td&gt;
&lt;td&gt;Xῆτο&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xόμεθα&lt;/td&gt;
&lt;td&gt;Xούμεθα&lt;/td&gt;
&lt;td&gt;Xούμεθα&lt;/td&gt;
&lt;td&gt;Xώμεθα&lt;/td&gt;
&lt;td&gt;Xώμεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xεσθε&lt;/td&gt;
&lt;td&gt;Xεῖσθε&lt;/td&gt;
&lt;td&gt;Xοῦσθε&lt;/td&gt;
&lt;td&gt;Xᾶσθε&lt;/td&gt;
&lt;td&gt;Xῆσθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xοντο&lt;/td&gt;
&lt;td&gt;Xοῦντο&lt;/td&gt;
&lt;td&gt;Xοῦντο&lt;/td&gt;
&lt;td&gt;Xῶντο&lt;/td&gt;
&lt;td&gt;Xῶντο&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IM-6&lt;/th&gt;
&lt;th&gt;IM-7&lt;/th&gt;
&lt;th&gt;IM-8&lt;/th&gt;
&lt;th&gt;IM-9&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xύμην&lt;/td&gt;
&lt;td&gt;Xέμην&lt;/td&gt;
&lt;td&gt;Xόμην&lt;/td&gt;
&lt;td&gt;Xάμην&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xυσο&lt;/td&gt;
&lt;td&gt;Xεσο&lt;/td&gt;
&lt;td&gt;Xοσο&lt;/td&gt;
&lt;td&gt;Xασο/Xω&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xυτο&lt;/td&gt;
&lt;td&gt;Xετο&lt;/td&gt;
&lt;td&gt;Xοτο&lt;/td&gt;
&lt;td&gt;Xατο&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xύμεθα&lt;/td&gt;
&lt;td&gt;Xέμεθα&lt;/td&gt;
&lt;td&gt;Xόμεθα&lt;/td&gt;
&lt;td&gt;Xάμεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xυσθε&lt;/td&gt;
&lt;td&gt;Xεσθε&lt;/td&gt;
&lt;td&gt;Xοσθε&lt;/td&gt;
&lt;td&gt;Xασθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xυντο&lt;/td&gt;
&lt;td&gt;Xεντο&lt;/td&gt;
&lt;td&gt;Xοντο&lt;/td&gt;
&lt;td&gt;Xαντο&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;and if we capture the common elements in each row:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IM-1&lt;/th&gt;
&lt;th&gt;IM-2&lt;/th&gt;
&lt;th&gt;IM-3&lt;/th&gt;
&lt;th&gt;IM-4&lt;/th&gt;
&lt;th&gt;IM-5&lt;/th&gt;
&lt;th&gt;IM-6&lt;/th&gt;
&lt;th&gt;IM-7&lt;/th&gt;
&lt;th&gt;IM-8&lt;/th&gt;
&lt;th&gt;IM-9&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-μην&lt;/td&gt;
&lt;td&gt;-μην&lt;/td&gt;
&lt;td&gt;-μην&lt;/td&gt;
&lt;td&gt;-μην&lt;/td&gt;
&lt;td&gt;-μην&lt;/td&gt;
&lt;td&gt;-μην&lt;/td&gt;
&lt;td&gt;-μην&lt;/td&gt;
&lt;td&gt;-μην&lt;/td&gt;
&lt;td&gt;-μην&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-{ο}&lt;/td&gt;
&lt;td&gt;-{ο}&lt;/td&gt;
&lt;td&gt;-{ο}&lt;/td&gt;
&lt;td&gt;-{ο}&lt;/td&gt;
&lt;td&gt;-{ο}&lt;/td&gt;
&lt;td&gt;-σο&lt;/td&gt;
&lt;td&gt;-σο&lt;/td&gt;
&lt;td&gt;-σο&lt;/td&gt;
&lt;td&gt;-σο/-{ο}&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-το&lt;/td&gt;
&lt;td&gt;-το&lt;/td&gt;
&lt;td&gt;-το&lt;/td&gt;
&lt;td&gt;-το&lt;/td&gt;
&lt;td&gt;-το&lt;/td&gt;
&lt;td&gt;-το&lt;/td&gt;
&lt;td&gt;-το&lt;/td&gt;
&lt;td&gt;-το&lt;/td&gt;
&lt;td&gt;-το&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-ντο&lt;/td&gt;
&lt;td&gt;-ντο&lt;/td&gt;
&lt;td&gt;-ντο&lt;/td&gt;
&lt;td&gt;-ντο&lt;/td&gt;
&lt;td&gt;-ντο&lt;/td&gt;
&lt;td&gt;-ντο&lt;/td&gt;
&lt;td&gt;-ντο&lt;/td&gt;
&lt;td&gt;-ντο&lt;/td&gt;
&lt;td&gt;-ντο&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Just as with the present middles, other than the contraction happening in &lt;strong&gt;2SG&lt;/strong&gt; (in this case obscuring the historical σο), there is no difference between the thematic and athematic endings.&lt;/p&gt;
&lt;p&gt;As with the other paradigms we&#39;ve seen, some cells across inflectional classes have identical distinguishers and so those cells alone can&#39;t identify the inflectional class (and hence all the other forms in that class). In particular:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;1SG&lt;/strong&gt;, &lt;strong&gt;1PL&lt;/strong&gt;, and &lt;strong&gt;3PL&lt;/strong&gt; can&#39;t distinguish within the set {&lt;strong&gt;IM-1&lt;/strong&gt;, &lt;strong&gt;IM-8&lt;/strong&gt;} or within the set {&lt;strong&gt;IM-2&lt;/strong&gt;, &lt;strong&gt;IM-3&lt;/strong&gt;} or within the set {&lt;strong&gt;IM-4&lt;/strong&gt;, &lt;strong&gt;IM-5&lt;/strong&gt;}&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;3SG&lt;/strong&gt; and &lt;strong&gt;2PL&lt;/strong&gt; can&#39;t distinguish within the set {&lt;strong&gt;IM-1&lt;/strong&gt;, &lt;strong&gt;IM-7&lt;/strong&gt;}&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;2SG&lt;/strong&gt; can&#39;t distinguish within the set {&lt;strong&gt;IM-2&lt;/strong&gt;, &lt;strong&gt;IM-3&lt;/strong&gt;} or within the set {&lt;strong&gt;IM-4&lt;/strong&gt;, &lt;strong&gt;IM-5&lt;/strong&gt;}.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Or to flip it around:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;classes&lt;/th&gt;
&lt;th&gt;characteristics&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM&lt;/strong&gt;-{&lt;strong&gt;1&lt;/strong&gt;, &lt;strong&gt;7&lt;/strong&gt;}&lt;/td&gt;
&lt;td&gt;ε in &lt;strong&gt;3SG&lt;/strong&gt;, &lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM&lt;/strong&gt;-{&lt;strong&gt;1&lt;/strong&gt;, &lt;strong&gt;8&lt;/strong&gt;}&lt;/td&gt;
&lt;td&gt;ό in &lt;strong&gt;1SG&lt;/strong&gt;, &lt;strong&gt;1PL&lt;/strong&gt;; ο in &lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM&lt;/strong&gt;-{&lt;strong&gt;2&lt;/strong&gt;, &lt;strong&gt;3&lt;/strong&gt;}&lt;/td&gt;
&lt;td&gt;ού in &lt;strong&gt;1SG&lt;/strong&gt;, &lt;strong&gt;1PL&lt;/strong&gt;; οῦ in &lt;strong&gt;2SG&lt;/strong&gt;, &lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM&lt;/strong&gt;-{&lt;strong&gt;4&lt;/strong&gt;, &lt;strong&gt;5&lt;/strong&gt;}&lt;/td&gt;
&lt;td&gt;ώ in &lt;strong&gt;1SG&lt;/strong&gt;, &lt;strong&gt;1PL&lt;/strong&gt;; ῶ in &lt;strong&gt;2SG&lt;/strong&gt;, &lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Notice that &lt;strong&gt;1SG&lt;/strong&gt;, &lt;strong&gt;1PL&lt;/strong&gt;, and &lt;strong&gt;3PL&lt;/strong&gt; are the ones with a theme vowel in -ο- and &lt;strong&gt;3SG&lt;/strong&gt; and &lt;strong&gt;2PL&lt;/strong&gt; are the ones with a theme vowel in -ε-. There is, of course, nothing magical about this. Cells with an omicron theme vowel will fall together with athematic stems ending in omicron and cells with an epsilon theme vowel will fall together with athematic stems ending in epsilon.&lt;/p&gt;
&lt;p&gt;But notice also that cells that fall together because of contraction with an omicron theme vowel will be distinct in contractions with an epsilon theme vowel and vice-versa.&lt;/p&gt;
&lt;p&gt;That means that you just need a cell for a person-number that takes an omicron theme vowel and a cell for a person-number that takes an epsilon theme vowel: the two of them are enough to give you the inflectional class. In this sense, the ablaut of the theme vowel actually works to counteract the ambiguity caused by contraction.&lt;/p&gt;
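&lt;p&gt;As a rough sketch of that claim (not code from this series), the following Python takes only the ambiguity sets listed earlier and checks that combining one omicron cell (&lt;strong&gt;1SG&lt;/strong&gt;, &lt;strong&gt;1PL&lt;/strong&gt;, or &lt;strong&gt;3PL&lt;/strong&gt;) with one epsilon cell (&lt;strong&gt;3SG&lt;/strong&gt; or &lt;strong&gt;2PL&lt;/strong&gt;) always leaves a single candidate class. It assumes the classes are numbered &lt;strong&gt;IM-1&lt;/strong&gt; through &lt;strong&gt;IM-8&lt;/strong&gt;, and the names &lt;code&gt;candidates&lt;/code&gt; and &lt;code&gt;identify&lt;/code&gt; are made up for illustration.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;# Ambiguity sets from the list earlier: classes in the same set share a
# distinguisher in the given cells, so those cells cannot tell them apart.
O_CELL_SETS = [{&amp;quot;IM-1&amp;quot;, &amp;quot;IM-8&amp;quot;}, {&amp;quot;IM-2&amp;quot;, &amp;quot;IM-3&amp;quot;}, {&amp;quot;IM-4&amp;quot;, &amp;quot;IM-5&amp;quot;}]  # 1SG, 1PL, 3PL
E_CELL_SETS = [{&amp;quot;IM-1&amp;quot;, &amp;quot;IM-7&amp;quot;}]  # 3SG, 2PL

# Assumption: the classes are numbered IM-1 through IM-8.
CLASSES = [&amp;quot;IM-&amp;quot; + str(n) for n in range(1, 9)]


def candidates(cls, ambiguity_sets):
    # Classes that a single cell cannot tell apart from cls (including cls).
    for group in ambiguity_sets:
        if cls in group:
            return set(group)
    return {cls}


def identify(cls):
    # Candidates left after seeing one omicron cell and one epsilon cell.
    return candidates(cls, O_CELL_SETS).intersection(candidates(cls, E_CELL_SETS))


for cls in CLASSES:
    print(cls, sorted(identify(cls)))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Each class comes back as the only member of its own candidate set, because no two classes share a set in both the omicron cells and the epsilon cells.&lt;/p&gt;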
&lt;p&gt;This is the sort of systemic view of morphology that I think is very important, rather than just thinking of things in terms of parts of words being combined together.&lt;/p&gt;
&lt;p&gt;In an upcoming post we&#39;ll check whether this covers all the imperfect middles in MorphGNT.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part twenty-nine of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">Mounce on Ablaut (Or Not)</title>
    <link href="https://jktauber.com/2019/11/21/mounce-on-ablaut-or-not/" rel="alternate" type="text/html" title="Mounce on Ablaut (Or Not)"/>
    <published>2019-11-21T13:23:33-05:00</published>
    <updated>2019-11-21T13:23:33-05:00</updated>
    <id>https://jktauber.com/2019/11/21/mounce-on-ablaut-or-not</id>
    <content type="html" xml:base="https://jktauber.com/2019/11/21/mounce-on-ablaut-or-not/">&lt;p&gt;Mounce’s &lt;em&gt;Basics of Biblical Greek Grammar&lt;/em&gt; is a very popular modern textbook, with over 400,000 copies sold and now in its fourth edition. There’s a lot one could quibble with around the usual suspects of deponency, aspect, or the general grammar-translation approach but it’s particularly odd when basic (and, as far as I know, uncontroversial) terminology is misused or misunderstood. I’m talking in particular about the way “ablaut” is discussed.&lt;/p&gt;
&lt;p&gt;Here’s one of his Eight Noun Rules on page 422 (BBG 4th Edition):&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;/images/mounce_ablaut.jpg&#34;&gt;&lt;/p&gt;
&lt;p&gt;There are several problems here:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;“ablaut” is not simply “vowels changing their length”&lt;/li&gt;
&lt;li&gt;“contraction” is definitely not a form of ablaut&lt;/li&gt;
&lt;li&gt;“compensatory lengthening” is definitely not a form of ablaut either (certainly not in the example given, but see &lt;strong&gt;UPDATE&lt;/strong&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;So What is Ablaut?&lt;/h2&gt;
&lt;p&gt;More generally, &lt;strong&gt;vowel gradation&lt;/strong&gt; is a grammatical alternation expressed via a vowel change. In English, sing ~ sang ~ sung expresses a contrast in tense-aspect; sing ~ song expresses a contrast in part of speech; foot ~ feet expresses a contrast in number.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ablaut&lt;/strong&gt; is a specific type of vowel gradation in Proto-Indo-European and many of its daughter languages. In PIE, it involved alternations between ∅ ~ e ~ o ~ ē ~ ō that were related, in part, to accentuation (∅ stands for the absence of the vowel, also called the &#39;zero grade&#39;).&lt;/p&gt;
&lt;p&gt;Alternations in English like sing ~ sang ~ sung and sing ~ song can be traced to PIE ablaut. However, alternations like foot ~ feet (or man ~ men) are a different type of vowel gradation caused historically in Germanic languages by an -i that later dropped. This is called &lt;strong&gt;umlaut&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Because Greek had ablaut but not umlaut, the term “vowel gradation” is often used synonymously with “ablaut” when talking entirely about Greek. But technically ablaut is just one type of vowel gradation.&lt;/p&gt;
&lt;p&gt;Ablaut is behind Greek alternations like λέγω ~ λόγος, πατήρ ~ πατέρα ~ πατρός, the theme vowel changes in -ο-μεν ~ -ε-τε, and the stem change in λείπ-ω ~ ἔ-λιπ-ον ~ λέ-λοιπ-α.&lt;/p&gt;
&lt;p&gt;Contraction and compensatory lengthening are unrelated processes in Greek. They do involve vowel change, but not as part of a grammatical alternation.&lt;/p&gt;
&lt;h2&gt;Does Mounce Say Anything More About Ablaut?&lt;/h2&gt;
&lt;p&gt;On page 58 he again gives λογο + ι ➝ λογῷ as his example, describing “ablaut” as the “technical term” for the vowel length change. He further elaborates, saying that vowels can shorten (ω ➝ ο) or lengthen (ο ➝ ω) or disappear entirely. This explanation is somewhat okay, although λογῷ is a bad example. Then in a footnote he talks about compensatory lengthening and says “this is a form of ablaut”. This is incorrect and misleading, because compensatory lengthening is not a form of ablaut or even vowel gradation. Ablaut (and vowel gradation) requires an alternation where the vowel difference signals something different grammatically.&lt;/p&gt;
&lt;p&gt;On page 132 he describes the vocative singular as the bare stem “sometimes with the stem vowel being changed (ablaut)“. So he gets it right there even though the obvious example of second declension masculine vocatives in -ε (alternating with the ο in the nominative) contradicts his definition of ablaut as “vowels changing their length”.&lt;/p&gt;
&lt;p&gt;On page 216, he says that “liquid roots (λ, μ, ν, ρ) are generally used without modification (except for ablaut)“. This is also a valid use of the term ablaut, but note that ablaut between verb stems is not just limited to liquid roots (I gave the example λείπ-ω ~ ἔ-λιπ-ον ~ λέ-λοιπ-α above).&lt;/p&gt;
&lt;h2&gt;Does This Actually Matter?&lt;/h2&gt;
&lt;p&gt;In terms of language acquisition, hundreds of millions of English speakers get by fine without knowing that sing ~ sang ~ sung is ablaut. Similarly it’s perfectly possible to learn Greek without even being consciously aware of all the alternations much less putting a label on them.&lt;/p&gt;
&lt;p&gt;In this sense, students using BBG are hardly hurt by Mounce’s incorrect (or at the very least misleading) use of the term “ablaut”.&lt;/p&gt;
&lt;p&gt;That said, Mounce&#39;s textbook is a grammar and is used as students&#39; first introduction to Greek grammar in particular. Students should not be introduced to technical terminology in ways that are incorrect, and will require unlearning later on. Ablaut is neither a recent term (Jacob Grimm coined it 200 years ago and the concept was known to the Sanskrit grammarians two &lt;em&gt;millennia&lt;/em&gt; ago) nor is it a contested term.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Thanks to Seumas Macdonald for his feedback on a draft of this post.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE&lt;/strong&gt;: &#34;I thought that πατήρ ~ πατέρα was compensatory lengthening from loss of sigma in nominative singular&#34;. Some Greek grammars explain it this way and it&#39;s potentially half-right. Many Indo-Europeanists (but not all) think that at an earlier stage, PIE *ph&lt;sub&gt;2&lt;/sub&gt;-tḗr (from where πατήρ came) was probably *ph&lt;sub&gt;2&lt;/sub&gt;-tér-s, thus keeping the nominative ending consistently -s at that earlier stage. This change *-VRs &amp;gt; *-VːR (where V is a vowel, Vː is a long vowel, and R is a resonant like r or n) is known as Szemerényi&#39;s law. While this &lt;em&gt;is&lt;/em&gt; broadly a type of compensatory lengthening it has no relationship to various compensatory lengthening processes in Greek itself and the three way alternation *ph&lt;sub&gt;2&lt;/sub&gt;-tḗr ~ *ph&lt;sub&gt;2&lt;/sub&gt;-tér-m̥ ~ *ph&lt;sub&gt;2&lt;/sub&gt;-tr-és already existed in PIE before Greek.&lt;/p&gt;
&lt;p&gt;Thus grammars that describe a three-way vowel grade alternation are correctly describing the situation in Greek. Those grammars that mention the possible earlier change in Indo-European should probably make it clear this is not what is normally thought of as compensatory lengthening internal to Greek. And certainly none of this changes anything I&#39;ve said about Mounce&#39;s account.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Mounce’s &lt;em&gt;Basics of Biblical Greek Grammar&lt;/em&gt; is a very popular modern textbook, with over 400,000 copies sold and now in its fourth edition. There’s a lot one could quibble with around the usual suspects of deponency, aspect, or the general grammar-translation approach but it’s particularly odd when basic (and, as far as I know, uncontroversial) terminology is misused or misunderstood. I’m talking in particular about the way “ablaut” is discussed.</summary>
  </entry><entry>
    <title type="html">Dictionary Markup versus Lexical Modelling</title>
    <link href="https://jktauber.com/2019/11/15/dictionary-markup-versus-lexical-modelling/" rel="alternate" type="text/html" title="Dictionary Markup versus Lexical Modelling"/>
    <published>2019-11-15T07:49:38-05:00</published>
    <updated>2019-11-15T07:49:38-05:00</updated>
    <id>https://jktauber.com/2019/11/15/dictionary-markup-versus-lexical-modelling</id>
    <content type="html" xml:base="https://jktauber.com/2019/11/15/dictionary-markup-versus-lexical-modelling/">&lt;p&gt;This year I&#39;ve been thinking about (and working on) the representation of lexical information quite a bit.&lt;/p&gt;
&lt;p&gt;This is nothing new, but recently, thoughts and activity have been motivated on multiple fronts including:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;work earlier this year starting to &lt;a href=&#34;https://github.com/jtauber/cunliffe&#34;&gt;extract Homeric Greek information from Cunliffe&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;a new project to &lt;a href=&#34;https://github.com/digitaltolkien/a-middle-english-vocabulary&#34;&gt;digitise Tolkien&#39;s &lt;em&gt;A Middle-English Vocabulary&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;work collaborating on the &lt;a href=&#34;https://greekwordnet.chs.harvard.edu&#34;&gt;GreekWordNet&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;contributions to the &lt;a href=&#34;https://www.w3.org/community/ontolex/&#34;&gt;Ontology-Lexica Community Group&lt;/a&gt; on modelling morphology (see a recent joint paper &lt;a href=&#34;https://elex.link/elex2019/wp-content/uploads/2019/09/eLex_2019_33.pdf&#34;&gt;Challenges for the Representation of Morphology in Ontology Lexicons&lt;/a&gt; from eLex 2019)&lt;/li&gt;
&lt;li&gt;gathering lexical information for the &lt;a href=&#34;{% post_url 2019-11-02-greek-texts-project %}&#34;&gt;Greek Texts&lt;/a&gt; project.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Plus my long-term vision of a comprehensive, machine actionable description of Greek morphology.&lt;/p&gt;
&lt;p&gt;One important distinction comes up, though, which I&#39;ve ranted about a number of times on Twitter.&lt;/p&gt;
&lt;blockquote class=&#34;twitter-tweet&#34;&gt;&lt;p lang=&#34;en&#34; dir=&#34;ltr&#34;&gt;I guess Abbott-Smith for GNT too. I&amp;#39;ve long been interested in the relationship (and mismatches) between printed dictionaries and the modelling of the information therein (especially beyond just the glosses/definitions; e.g. morphology, etymology)&lt;/p&gt;&amp;mdash; James Tauber (@jtauber) &lt;a href=&#34;https://twitter.com/jtauber/status/1162451605360431105?ref_src=twsrc%5Etfw&#34;&gt;August 16, 2019&lt;/a&gt;&lt;/blockquote&gt;
&lt;script async src=&#34;https://platform.twitter.com/widgets.js&#34; charset=&#34;utf-8&#34;&gt;&lt;/script&gt;

&lt;p&gt;In short, &lt;strong&gt;marking up a print dictionary is not the same as real modelling of lexical information&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Obviously the intention of a print dictionary is to convey that information but it does so in a form ultimately only appropriate for human interpretation from the page (in print or an online facsimile of some sort). It&#39;s not really machine actionable.&lt;/p&gt;
&lt;p&gt;Now wait, you might think. All we need to do is mark up the dictionary electronically using some format like TEI.&lt;/p&gt;
&lt;p&gt;Just to randomly pick an early entry in the conversion of Abbott-Smith to TEI:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-xml&#34;&gt;&amp;lt;entry lemma=&amp;quot;ἀγαθοποιός&amp;quot; strong=&amp;quot;G17&amp;quot;&amp;gt;
  &amp;lt;note type=&amp;quot;occurrencesNT&amp;quot;&amp;gt;1&amp;lt;/note&amp;gt;
  &amp;lt;form&amp;gt;**† &amp;lt;orth&amp;gt;ἀγαθοποιός&amp;lt;/orth&amp;gt;, &amp;lt;foreign xml:lang=&amp;quot;grc&amp;quot;&amp;gt;-όν&amp;lt;/foreign&amp;gt; = cl. &amp;lt;foreign xml:lang=&amp;quot;grc&amp;quot;&amp;gt;ἀγαθουργός&amp;lt;/foreign&amp;gt;,&amp;lt;/form&amp;gt;
  &amp;lt;etym&amp;gt;
    &amp;lt;seg type=&amp;quot;septuagint&amp;quot;&amp;gt;[in LXX, of a woman who deals pleasantly in order to corrupt, &amp;lt;ref osisRef=&amp;quot;Sir.42.14&amp;quot;&amp;gt;Si 42:14&amp;lt;/ref&amp;gt;*;]&amp;lt;/seg&amp;gt;
  &amp;lt;/etym&amp;gt;
  &amp;lt;sense&amp;gt;&amp;lt;gloss&amp;gt;doing well&amp;lt;/gloss&amp;gt;, &amp;lt;gloss&amp;gt;acting rightly&amp;lt;/gloss&amp;gt; (Plut.): &amp;lt;ref osisRef=&amp;quot;1Pet.2.14&amp;quot;&amp;gt;I Pe 2:14&amp;lt;/ref&amp;gt; (Cremer, 8; MM, &amp;lt;emph&amp;gt;VGT&amp;lt;/emph&amp;gt;, s.v.).†&amp;lt;/sense&amp;gt;
&amp;lt;/entry&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;There&#39;s &lt;em&gt;some&lt;/em&gt; extractable information here, like the Strong&#39;s number, a clear lemma, some biblical references, the number of occurrences in the NT and some glosses. But some of the information goes unanalysed and is merely presented as in the print dictionary. Things like the initial &lt;code&gt;**†&lt;/code&gt; in the entry are left unexplicated. The -όν termination indicating the inflectional class (and indirectly the part of speech) is merely marked up as a Greek word. The LXX reference is treated as etymology and yet the classical equivalent has to be decoded from the &lt;code&gt;= cl.&lt;/code&gt;.&lt;/p&gt;
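&lt;p&gt;To make the &#34;extractable&#34; part concrete, here is a minimal sketch using Python&#39;s standard &lt;code&gt;xml.etree.ElementTree&lt;/code&gt; on a trimmed copy of the entry. The keys of the resulting dictionary are made up for illustration and are not part of any existing tool.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;import xml.etree.ElementTree as ET

# A trimmed copy of the Abbott-Smith entry (form and etym omitted).
ENTRY = &amp;quot;&amp;quot;&amp;quot;&amp;lt;entry lemma=&amp;quot;ἀγαθοποιός&amp;quot; strong=&amp;quot;G17&amp;quot;&amp;gt;
  &amp;lt;note type=&amp;quot;occurrencesNT&amp;quot;&amp;gt;1&amp;lt;/note&amp;gt;
  &amp;lt;sense&amp;gt;&amp;lt;gloss&amp;gt;doing well&amp;lt;/gloss&amp;gt;, &amp;lt;gloss&amp;gt;acting rightly&amp;lt;/gloss&amp;gt; (Plut.): &amp;lt;ref osisRef=&amp;quot;1Pet.2.14&amp;quot;&amp;gt;I Pe 2:14&amp;lt;/ref&amp;gt;&amp;lt;/sense&amp;gt;
&amp;lt;/entry&amp;gt;&amp;quot;&amp;quot;&amp;quot;

entry = ET.fromstring(ENTRY)

# Index the note elements by their type attribute.
notes = {note.get(&amp;quot;type&amp;quot;): note.text for note in entry.iter(&amp;quot;note&amp;quot;)}

info = {
    &amp;quot;lemma&amp;quot;: entry.get(&amp;quot;lemma&amp;quot;),
    &amp;quot;strong&amp;quot;: entry.get(&amp;quot;strong&amp;quot;),
    &amp;quot;occurrences_nt&amp;quot;: int(notes[&amp;quot;occurrencesNT&amp;quot;]),
    &amp;quot;glosses&amp;quot;: [gloss.text for gloss in entry.iter(&amp;quot;gloss&amp;quot;)],
    &amp;quot;refs&amp;quot;: [ref.get(&amp;quot;osisRef&amp;quot;) for ref in entry.iter(&amp;quot;ref&amp;quot;)],
}
print(info)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Nothing comparably simple recovers the inflectional class from the -όν termination or the classical equivalent from &lt;code&gt;= cl.&lt;/code&gt;, since the markup only records them as Greek text and surrounding punctuation.&lt;/p&gt;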
&lt;p&gt;This is not to pick specifically on the Abbott-Smith conversion. Marked-up versions of Cunliffe, LSJ, even a version I had of the Barclay Newman glossary in BetaCode in the mid-90s, are all primarily attempting to convey the typography of the printed work, sprinkling a little bit of descriptive rather than purely presentational markup over the original content (so you could at least use a stylesheet to decide to make headwords &lt;strong&gt;bold&lt;/strong&gt; or something, rather than doing it inline).&lt;/p&gt;
&lt;p&gt;It would still take a lot of work to extract morphological or etymological information from this type of markup.&lt;/p&gt;
&lt;p&gt;A very different kind of approach is to focus on actually modelling the lexical information and &lt;em&gt;only then&lt;/em&gt; worrying about mapping it to some visual presentation.&lt;/p&gt;
&lt;p&gt;TEI somewhat recognises the distinction and actually offers a variety of approaches to dictionary markup, one that is focused mostly on the markup for display (like in a printed book) and one that is more focused on the underlying data (although there are other formats for that too).&lt;/p&gt;
&lt;p&gt;Of course if you&#39;re marking up &lt;em&gt;an existing print dictionary&lt;/em&gt; you&#39;re pretty much doing the former. A more abstract modelling of the lexical information &lt;em&gt;in&lt;/em&gt; Abbott-Smith (or LSJ, or Cunliffe, or Tolkien) is no longer a marked up version of that dictionary.&lt;/p&gt;
&lt;p&gt;It&#39;s a challenging problem for sure (and I&#39;m certainly experiencing it firsthand on the Tolkien project—automatic extraction of even just the etymological information from the Middle English vocabulary entries has required tens of regular expressions).&lt;/p&gt;
&lt;p&gt;What we really need moving forward is more focus on the underlying lexical modelling and not assuming that marking up Abbott-Smith or the LSJ better is the solution. And it&#39;s not like there isn&#39;t a ton of work going on in good electronic representation of lexical information for modern languages.&lt;/p&gt;
&lt;p&gt;I ranted a little bit about the more general issue in my &lt;a href=&#34;{% post_url 2015-05-06-my-bibletech-2015-talk %}&#34;&gt;BibleTech 2015 talk on Biblical Greek Databases&lt;/a&gt; where I talked about how many reference books should be thought of (and indeed &lt;em&gt;produced&lt;/em&gt;) more as &#34;UI on top of databases&#34;. In other words, you focus on the data and then at some point have a largely automated process for generating printed works from that data. One of my examples in that talk was &#34;Readers Editions&#34; of texts with glossaries on each page. But I think it applies even more to dictionaries.&lt;/p&gt;
&lt;p&gt;So I think it&#39;s important that we recognise the distinction between:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the presentation of lexical information in a print dictionary (or online equivalent)&lt;/li&gt;
&lt;li&gt;the descriptive markup of those dictionaries&lt;/li&gt;
&lt;li&gt;the underlying linguistic information&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;and recognise that collaboration on and exchange of the last of those three is ultimately the most valuable.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE&lt;/strong&gt;: Here&#39;s an entry from the upcoming Cambridge lexicon. It&#39;s about as close as you could get to machine-actionable descriptive markup that is still pretty much following the structure of a print lexicon:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-xml&#34;&gt;&amp;lt;AE&amp;gt;
  &amp;lt;HG&amp;gt;
    &amp;lt;HL&amp;gt;ξανθό&amp;lt;hyph/&amp;gt;θριξ&amp;lt;/HL&amp;gt;
    &amp;lt;Infl&amp;gt;τριχος&amp;lt;/Infl&amp;gt;
    &amp;lt;PS&amp;gt;masc.fem.adj&amp;lt;/PS&amp;gt;
    &amp;lt;Ety&amp;gt;
      &amp;lt;Ref&amp;gt;θρίξ&amp;lt;/Ref&amp;gt;
    &amp;lt;/Ety&amp;gt;
  &amp;lt;/HG&amp;gt;
  &amp;lt;aS1&amp;gt;
    &amp;lt;Tr&amp;gt;fair&amp;lt;or/&amp;gt;golden-haired&amp;lt;/Tr&amp;gt;
    &amp;lt;Au&amp;gt; Sol. Theoc.&amp;lt;/Au&amp;gt;
  &amp;lt;/aS1&amp;gt;
  &amp;lt;aS1&amp;gt;
    &amp;lt;Indic&amp;gt;of a horse&amp;lt;/Indic&amp;gt;
    &amp;lt;Def&amp;gt;with light-coloured hair or mane&amp;lt;/Def&amp;gt;
    &amp;lt;Tr&amp;gt;golden&amp;lt;/Tr&amp;gt;
    &amp;lt;Au&amp;gt;B.&amp;lt;/Au&amp;gt;
  &amp;lt;/aS1&amp;gt;
&amp;lt;/AE&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Notice that this XML doesn&#39;t include the square brackets that will go around the etymology in the print version. They are treated entirely as presentation. The disjunction between the glosses &#39;fair&#39; and &#39;golden-haired&#39; is markup, not content. Even whitespace and punctuation have to largely come from the stylesheet. Definitions are distinguished from translation glosses and also applications like &#39;of a horse&#39;.&lt;/p&gt;
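&lt;p&gt;A small sketch of what &#34;coming from the stylesheet&#34; could look like, written in Python rather than an actual stylesheet language. The rendering choices here (the word &#34;or&#34; between glosses, square brackets and commas for the etymology) are assumptions for illustration, not the publisher&#39;s actual rules, and the function names are hypothetical.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;import xml.etree.ElementTree as ET


def render_tr(tr):
    # Flatten a Tr element, supplying display text for each empty or element.
    parts = [tr.text or &amp;quot;&amp;quot;]
    for child in tr:
        if child.tag == &amp;quot;or&amp;quot;:
            parts.append(&amp;quot; or &amp;quot;)  # assumed rendering of the disjunction
        parts.append(child.tail or &amp;quot;&amp;quot;)
    return &amp;quot;&amp;quot;.join(parts)


def render_ety(ety):
    # The brackets are added at presentation time; they are not in the data.
    refs = [ref.text for ref in ety.iter(&amp;quot;Ref&amp;quot;)]
    return &amp;quot;[&amp;quot; + &amp;quot;, &amp;quot;.join(refs) + &amp;quot;]&amp;quot;


tr = ET.fromstring(&amp;quot;&amp;lt;Tr&amp;gt;fair&amp;lt;or/&amp;gt;golden-haired&amp;lt;/Tr&amp;gt;&amp;quot;)
ety = ET.fromstring(&amp;quot;&amp;lt;Ety&amp;gt;&amp;lt;Ref&amp;gt;θρίξ&amp;lt;/Ref&amp;gt;&amp;lt;/Ety&amp;gt;&amp;quot;)
print(render_tr(tr))
print(render_ety(ety))
&lt;/code&gt;&lt;/pre&gt;
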
&lt;p&gt;This still doesn&#39;t go as far as I&#39;m talking about in terms of truly modelling the underlying linguistic information but it&#39;s a lot easier to do things with this computationally than just a markup of an existing print dictionary would be.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">This year I&#39;ve been thinking about (and working on) the representation of lexical information quite a bit.</summary>
  </entry><entry>
    <title type="html">Release of text-validator 0.1</title>
    <link href="https://jktauber.com/2019/11/11/release-of-textvalidator-01/" rel="alternate" type="text/html" title="Release of text-validator 0.1"/>
    <published>2019-11-11T22:37:40-05:00</published>
    <updated>2019-11-11T22:37:40-05:00</updated>
    <id>https://jktauber.com/2019/11/11/release-of-textvalidator-01</id>
    <content type="html" xml:base="https://jktauber.com/2019/11/11/release-of-textvalidator-01/">&lt;p&gt;I&#39;ve released a first version of a pluggable command-line tool for validating the formatting and orthography of text files.&lt;/p&gt;
&lt;p&gt;Various text projects like the &lt;a href=&#34;/tag/project:apostolic-fathers/&#34;&gt;apostolic-fathers&lt;/a&gt; have sometimes included little custom scripts I&#39;ve written to validate the files. Is the Unicode normalised? Are there stray characters or bad line endings? Are references in a valid format?&lt;/p&gt;
&lt;p&gt;I had also started including some Greek-specific tests in the &lt;a href=&#34;/tag/project:greek-normalisation/&#34;&gt;greek-normalisation&lt;/a&gt; library.&lt;/p&gt;
&lt;p&gt;But starting the &lt;a href=&#34;/tag/project:greek-texts/&#34;&gt;greek-texts&lt;/a&gt; project, I decided it would be nice to have a generic framework for writing text file validators that could be used for all sorts of projects and files.&lt;/p&gt;
&lt;p&gt;The result is &lt;code&gt;text-validator&lt;/code&gt;. Think of it like a code linter but for your text files.&lt;/p&gt;
&lt;p&gt;Each validator is its own Python module and, while a few basic tests are included in the library, the idea is that third parties can write their own validators and make them installable Python packages for others to use.&lt;/p&gt;
&lt;p&gt;You install &lt;code&gt;text-validator&lt;/code&gt; with&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;pip install text-validator
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;as well as installing any third-party plugins you want to use.&lt;/p&gt;
&lt;p&gt;You then configure your validator plugins with a TOML file like:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;[&amp;quot;text_validator.plugins.whitespace&amp;quot;]
CHECK_CRLF = true
CHECK_TABS = true
CHECK_TRAILING_WHITESPACE = true
CHECK_NO_EOF_NEWLINE = true
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;and run the command &lt;code&gt;validate-text&lt;/code&gt; to run your suite of configured plugins on the files in your text project.&lt;/p&gt;
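&lt;p&gt;The option names suggest what each check looks for. Purely as an illustration, here is a standalone Python function that performs four checks of that kind. It is not the code of the bundled whitespace plugin and does not use the plugin API (see the wiki for that); the function name and messages are hypothetical.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;def check_whitespace(data):
    # Return (line_number, message) pairs for whitespace problems in bytes.
    problems = []
    lines = data.split(b&amp;quot;\n&amp;quot;)
    for number, line in enumerate(lines, start=1):
        if line.endswith(b&amp;quot;\r&amp;quot;):
            problems.append((number, &amp;quot;CRLF line ending&amp;quot;))
            line = line[:-1]  # drop the carriage return before the other checks
        if b&amp;quot;\t&amp;quot; in line:
            problems.append((number, &amp;quot;tab character&amp;quot;))
        if line != line.rstrip():
            problems.append((number, &amp;quot;trailing whitespace&amp;quot;))
    if data and not data.endswith(b&amp;quot;\n&amp;quot;):
        problems.append((len(lines), &amp;quot;no newline at end of file&amp;quot;))
    return problems


for problem in check_whitespace(b&amp;quot;a \r\n\tb&amp;quot;):
    print(problem)
&lt;/code&gt;&lt;/pre&gt;
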
&lt;p&gt;The GitHub repo is &lt;a href=&#34;https://github.com/jtauber/text-validator&#34;&gt;https://github.com/jtauber/text-validator&lt;/a&gt; and there you can also read more about &lt;a href=&#34;https://github.com/jtauber/text-validator/wiki/How-to-Write-a-Plugin&#34;&gt;How to Write a Plugin&lt;/a&gt; and look at the existing plugins in the &lt;a href=&#34;https://github.com/jtauber/text-validator/wiki/Plugin-Directory&#34;&gt;Plugin Directory&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Create issues in the GitHub repository if you have particular validators you&#39;d like to see or would like to contribute.&lt;/p&gt;
&lt;p&gt;I haven&#39;t tried it yet but I&#39;d like to try hooking &lt;code&gt;text-validator&lt;/code&gt; up as a test that gets run on commits and pull requests on GitHub as part of a CI process.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I&#39;ve released a first version of a pluggable command-line tool for validating the formatting and orthography of text files.</summary>
  </entry><entry>
    <title type="html">Off to the UCLA Indo-European Conference Again</title>
    <link href="https://jktauber.com/2019/11/07/off-to-the-ucla-indoeuropean-conference-again/" rel="alternate" type="text/html" title="Off to the UCLA Indo-European Conference Again"/>
    <published>2019-11-07T03:51:20-05:00</published>
    <updated>2019-11-07T03:51:20-05:00</updated>
    <id>https://jktauber.com/2019/11/07/off-to-the-ucla-indoeuropean-conference-again</id>
    <content type="html" xml:base="https://jktauber.com/2019/11/07/off-to-the-ucla-indoeuropean-conference-again/">&lt;p&gt;Today I&#39;m heading off to Los Angeles to attend the &lt;a href=&#34;https://pies.ucla.edu/IECprogram.html&#34;&gt;Thirty-First Annual UCLA Indo-European Conference&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I went two years ago and you may remember &lt;a href=&#34;{% post_url 2017-11-01-ucla-indo-european-conference %}&#34;&gt;my initial nervousness&lt;/a&gt; as a first-timer. But everyone was so nice and I got a lot out of it so I&#39;m headed back (plus this time I&#39;ll know more people).&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Today I&#39;m heading off to Los Angeles to attend the &lt;a href=&#34;https://pies.ucla.edu/IECprogram.html&#34;&gt;Thirty-First Annual UCLA Indo-European Conference&lt;/a&gt;.</summary>
  </entry><entry>
    <title type="html">Subcorpus Vocabulary Statistics</title>
    <link href="https://jktauber.com/2019/11/05/subcorpus-vocabulary-statistics/" rel="alternate" type="text/html" title="Subcorpus Vocabulary Statistics"/>
    <published>2019-11-05T18:03:44-05:00</published>
    <updated>2019-11-05T18:03:44-05:00</updated>
    <id>https://jktauber.com/2019/11/05/subcorpus-vocabulary-statistics</id>
    <content type="html" xml:base="https://jktauber.com/2019/11/05/subcorpus-vocabulary-statistics/">&lt;p&gt;Long-time readers of this blog know that, along with morphology, a core research area of mine is vocabulary. Prompted by Seumas Macdonald and now as part of the &lt;a href=&#34;{% post_url 2019-11-02-greek-texts-project %}&#34;&gt;Greek Texts Project&lt;/a&gt;, I started putting together some vocabulary coverage statistics for various subcorpora of Greek prose.&lt;/p&gt;
&lt;p&gt;I&#39;ve been publishing vocabulary coverage statistics for the Greek New Testament at least since 1996 (see &lt;a href=&#34;{% post_url 2007-11-04-gnt-verse-coverage-statistics %}&#34;&gt;GNT Verse Coverage Statistics&lt;/a&gt; and the more recent (and aptly named) &lt;a href=&#34;{% post_url 2015-10-26-updated-vocabulary-coverage-statistics %}&#34;&gt;Updated Vocabulary Coverage Statistics&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Back in &lt;a href=&#34;{% post_url 2015-10-30-core-vocabulary-new-testament-greek %}&#34;&gt;The Core Vocabulary of New Testament Greek&lt;/a&gt;, I looked at Wilfred Major&#39;s 50% and 80% lists for Classical Greek and constructed the equivalents for the Greek New Testament.&lt;/p&gt;
&lt;p&gt;For a while, on and off, I&#39;ve been working with reconciling Major&#39;s list with the &lt;a href=&#34;http://dcc.dickinson.edu/greek-core-list&#34;&gt;DCC Greek Core Vocabulary&lt;/a&gt;, my own GNT work, Helma Dik&#39;s amazing work on &lt;a href=&#34;https://logeion.uchicago.edu/&#34;&gt;Logeion&lt;/a&gt;, and other word lists based on frequency in some subcorpus of Ancient Greek. Back in early 2018, I also started &lt;a href=&#34;https://vocab.perseus.org&#34;&gt;https://vocab.perseus.org&lt;/a&gt;, primarily to serve up passage-specific vocabulary lists for &lt;a href=&#34;https://scaife.perseus.org&#34;&gt;https://scaife.perseus.org&lt;/a&gt; but also to enable exploration of vocabulary frequency in the &lt;a href=&#34;http://www.perseus.tufts.edu/&#34;&gt;Perseus Digital Library&lt;/a&gt; / &lt;a href=&#34;http://opengreekandlatin.org&#34;&gt;Open Greek and Latin&lt;/a&gt; corpus.&lt;/p&gt;
&lt;p&gt;As part of that last work, I put together an initial &#34;core reading list&#34; subcorpus based on works from reading lists from Harvard, Yale, and Tufts. My eventual goal was to allow the creation of custom reading lists and generate vocabulary for those. The data behind this was all based on an &lt;a href=&#34;https://github.com/gcelano/LemmatizedAncientGreekXML&#34;&gt;experimental lemmatisation&lt;/a&gt; done by Giuseppe Celano along with the &#34;short defs&#34; from Perseus via Logeion.&lt;/p&gt;
&lt;p&gt;I&#39;d been slowly getting back to my new &lt;a href=&#34;{% post_url 2019-04-20-consolidating-vocabulary-coverage-and-ordering-too %}&#34;&gt;vocabulary-tools&lt;/a&gt; code library for generating these kinds of stats for any lemmatised text when Seumas Macdonald asked about vocabulary in Plato, Lysias, and Xenophon—typical post-beginner prose.&lt;/p&gt;
&lt;p&gt;I took the opportunity to generalise some more of my code (although I haven&#39;t yet added it back to &lt;code&gt;vocabulary-tools&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;Plato + Lysias + Xenophon, as lemmatised by Celano, is 745,213 tokens with 13,274 lemmas, 3,457 of which are hapaxes within the subcorpus.&lt;/p&gt;
&lt;p&gt;Besides the actual list, what was of interest to both me and Seumas was:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;how many lemmas are needed for coverage points such as 80% or 98%&lt;/li&gt;
&lt;li&gt;what coverage particular numbers of lemmas get you to (in frequency order)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Now I&#39;ve talked at length here and in conference talks about the limitations of:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;just going by overall token coverage not coverage of larger units like verses or sentences or paragraphs&lt;/li&gt;
&lt;li&gt;just going by lemmas and not considering morphology, syntactic constructions, etc&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;but this is still useful and interesting data.&lt;/p&gt;
&lt;h3&gt;Plato + Lysias + Xenophon&lt;/h3&gt;
&lt;p&gt;Perseus Plato + Lysias + Xenophon subcorpus:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;The 50% point is reached at   48 lemmas (2454 occurrences at that point)
The 80% point is reached at  439 lemmas ( 181 occurrences at that point)
The 90% point is reached at 1242 lemmas (  50 occurrences at that point)
The 95% point is reached at 2519 lemmas (  16 occurrences at that point)
The 98% point is reached at 5003 lemmas (   5 occurrences at that point)
---
The 81.40% point is reached at  500 lemmas (159 occurrences at that point)
The 88.15% point is reached at 1000 lemmas ( 66 occurrences at that point)
The 93.59% point is reached at 2000 lemmas ( 25 occurrences at that point)
The 97.21% point is reached at 4000 lemmas (  7 occurrences at that point)
The 99.19% point is reached at 8000 lemmas (  2 occurrences at that point)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Just to quickly unpack that: the first line says that you can account for 50% of the tokens in the subcorpus just with the top 48 lemmas (by frequency). Furthermore, those 48 lemmas all occur at least 2,454 times each in the subcorpus.&lt;/p&gt;
&lt;p&gt;Similarly, the second-to-last line says that the top 4,000 lemmas by frequency all occur at least 7 times in the subcorpus and account for 97.21% of tokens.&lt;/p&gt;
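&lt;p&gt;Both kinds of line can be computed from nothing more than a table of lemma counts. The following is a minimal sketch of that calculation, not the code in &lt;code&gt;vocabulary-tools&lt;/code&gt;; the function names are hypothetical and the counts are toy values invented for illustration.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;from collections import Counter


# Map each percentage to (lemmas needed, occurrences of the last lemma added).
def coverage_points(lemma_counts, percentages):
    total = sum(lemma_counts.values())
    pending = sorted(percentages)
    points = {}
    covered = 0
    for rank, (_, count) in enumerate(lemma_counts.most_common(), start=1):
        covered += count
        # Integer arithmetic avoids floating-point surprises at exact boundaries.
        while pending and covered * 100 &amp;gt;= pending[0] * total:
            points[pending.pop(0)] = (rank, count)
    return points


# Return (fraction of tokens covered by the top n lemmas, occurrences of the nth).
def coverage_at(lemma_counts, n):
    total = sum(lemma_counts.values())
    top = lemma_counts.most_common(n)
    return sum(count for _, count in top) / total, top[-1][1]


# Toy counts, invented purely for illustration (not real corpus data).
counts = Counter({&amp;quot;a&amp;quot;: 5, &amp;quot;b&amp;quot;: 3, &amp;quot;c&amp;quot;: 1, &amp;quot;d&amp;quot;: 1})
print(coverage_points(counts, [50, 80, 90]))
print(coverage_at(counts, 2))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The first function answers &#34;how many lemmas for a given coverage&#34; and the second answers &#34;what coverage for a given number of lemmas&#34;, both taking lemmas in descending order of frequency.&lt;/p&gt;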
&lt;h3&gt;Plato&lt;/h3&gt;
&lt;p&gt;Just looking at Plato (with some extra lemma count breakpoints):&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;The 50% point is reached at   43 lemmas (1365 occurrences at that point)
The 80% point is reached at  321 lemmas ( 120 occurrences at that point)
The 90% point is reached at  893 lemmas (  32 occurrences at that point)
The 95% point is reached at 1840 lemmas (  10 occurrences at that point)
The 98% point is reached at 3631 lemmas (   3 occurrences at that point)
---
The 84.74% point is reached at  500 lemmas (66 occurrences at that point)
The 90.91% point is reached at 1000 lemmas (27 occurrences at that point)
The 95.46% point is reached at 2000 lemmas ( 9 occurrences at that point)
The 97.34% point is reached at 3000 lemmas ( 4 occurrences at that point)
The 98.32% point is reached at 4000 lemmas ( 3 occurrences at that point)
The 98.93% point is reached at 5000 lemmas ( 2 occurrences at that point)
&lt;/code&gt;&lt;/pre&gt;

&lt;h3&gt;Plato Selection&lt;/h3&gt;
&lt;p&gt;With just a selection of Plato (Euthyphro, Apology, Crito, Symposium, Republic):&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;The 50% point is reached at   43 lemmas (529 occurrences at that point)
The 80% point is reached at  335 lemmas ( 45 occurrences at that point)
The 90% point is reached at  908 lemmas ( 13 occurrences at that point)
The 95% point is reached at 1745 lemmas (  5 occurrences at that point)
The 98% point is reached at 3160 lemmas (  2 occurrences at that point)
---
The 84.31% point is reached at  500 lemmas (28 occurrences at that point)
The 90.85% point is reached at 1000 lemmas (11 occurrences at that point)
The 95.81% point is reached at 2000 lemmas ( 4 occurrences at that point)
The 97.76% point is reached at 3000 lemmas ( 2 occurrences at that point)
The 98.76% point is reached at 4000 lemmas ( 1 occurrences at that point)
The 99.51% point is reached at 5000 lemmas ( 1 occurrences at that point)
&lt;/code&gt;&lt;/pre&gt;

&lt;h3&gt;New Testament&lt;/h3&gt;
&lt;p&gt;It&#39;s interesting to compare that to MorphGNT given it’s the same size:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;The 50% point is reached at   27 lemmas (662 occurrences at that point)
The 80% point is reached at  316 lemmas ( 48 occurrences at that point)
The 90% point is reached at  890 lemmas ( 13 occurrences at that point)
The 95% point is reached at 1753 lemmas (  5 occurrences at that point)
The 98% point is reached at 3103 lemmas (  2 occurrences at that point)
---
The 84.84% point is reached at  500 lemmas (27 occurrences at that point)
The 90.95% point is reached at 1000 lemmas (11 occurrences at that point)
The 95.80% point is reached at 2000 lemmas ( 4 occurrences at that point)
The 97.85% point is reached at 3000 lemmas ( 2 occurrences at that point)
The 98.94% point is reached at 4000 lemmas ( 1 occurrences at that point)
The 99.66% point is reached at 5000 lemmas ( 1 occurrences at that point)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;although note these figures are based on my own more-curated lemmatisation of the New Testament, not on Celano&#39;s data, and systematic differences between the two may make this comparison slightly problematic.&lt;/p&gt;
&lt;h3&gt;Core Reading List&lt;/h3&gt;
&lt;p&gt;Here’s the “core reading list” with more 1000-markers:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;The 50% point is reached at    79 lemmas (2019 occurrences at that point)
The 80% point is reached at  1107 lemmas ( 134 occurrences at that point)
The 90% point is reached at  3020 lemmas (  39 occurrences at that point)
The 95% point is reached at  5948 lemmas (  14 occurrences at that point)
The 98% point is reached at 10920 lemmas (   5 occurrences at that point)
---
The 71.19% point is reached at   500 lemmas (310 occurrences at that point)
The 78.92% point is reached at  1000 lemmas (149 occurrences at that point)
The 86.17% point is reached at  2000 lemmas ( 69 occurrences at that point)
The 89.94% point is reached at  3000 lemmas ( 40 occurrences at that point)
The 92.28% point is reached at  4000 lemmas ( 26 occurrences at that point)
The 93.88% point is reached at  5000 lemmas ( 19 occurrences at that point)
The 95.05% point is reached at  6000 lemmas ( 14 occurrences at that point)
The 95.95% point is reached at  7000 lemmas ( 11 occurrences at that point)
The 96.65% point is reached at  8000 lemmas (  9 occurrences at that point)
The 97.20% point is reached at  9000 lemmas (  7 occurrences at that point)
The 97.66% point is reached at 10000 lemmas (  6 occurrences at that point)
The 98.33% point is reached at 12000 lemmas (  4 occurrences at that point)
The 98.98% point is reached at 15000 lemmas (  2 occurrences at that point)
The 99.57% point is reached at 20000 lemmas (  1 occurrences at that point)
&lt;/code&gt;&lt;/pre&gt;

&lt;h3&gt;Full Perseus&lt;/h3&gt;
&lt;p&gt;And finally, here’s the full Perseus / OGL (as of two years ago with the Celano lemmatisation):&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;The 50% point is reached at   42 lemmas (64483 occurrences at that point)
The 80% point is reached at  648 lemmas ( 3270 occurrences at that point)
The 90% point is reached at 1951 lemmas (  855 occurrences at that point)
The 95% point is reached at 4052 lemmas (  298 occurrences at that point)
The 98% point is reached at 8004 lemmas (   87 occurrences at that point)
---
The 77.44% point is reached at   500 lemmas (4263 occurrences at that point)
The 84.20% point is reached at  1000 lemmas (2018 occurrences at that point)
The 90.19% point is reached at  2000 lemmas ( 825 occurrences at that point)
The 93.14% point is reached at  3000 lemmas ( 480 occurrences at that point)
The 94.93% point is reached at  4000 lemmas ( 303 occurrences at that point)
The 96.10% point is reached at  5000 lemmas ( 208 occurrences at that point)
The 96.92% point is reached at  6000 lemmas ( 151 occurrences at that point)
The 97.53% point is reached at  7000 lemmas ( 114 occurrences at that point)
The 98.00% point is reached at  8000 lemmas (  87 occurrences at that point)
The 98.36% point is reached at  9000 lemmas (  69 occurrences at that point)
The 98.65% point is reached at 10000 lemmas (  55 occurrences at that point)
The 99.06% point is reached at 12000 lemmas (  36 occurrences at that point)
The 99.44% point is reached at 15000 lemmas (  20 occurrences at that point)
The 99.75% point is reached at 20000 lemmas (   8 occurrences at that point)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It is interesting how much more quickly the 50-80-90-95-98 points are hit with the full corpus than with the core reading list. Normally a larger corpus would take longer, but I think it’s indicative of the fact that the “core reading” has a richer vocabulary per token than a larger sample (an interesting study in itself of any subcorpus).&lt;/p&gt;
&lt;h3&gt;Next Steps&lt;/h3&gt;
&lt;p&gt;Since calculating all this, Seumas and I have been working on a different prose subcorpus for post-beginner learners that combines the Plato selection, the New Testament, other orators in addition to Lysias and other history in addition to Xenophon. I&#39;ll talk about that work in some future posts (and hopefully Seumas will too!)&lt;/p&gt;
&lt;p&gt;I also want to talk about how the different subcorpora &lt;em&gt;differ&lt;/em&gt; in what lemmas they have. How much of the Plato 80% is in the New Testament 80%, for example (and vice versa)? There&#39;s also the question of &lt;a href=&#34;{% post_url 2018-01-21-lexical-dispersion-greek-new-testament-gries-dp %}&#34;&gt;lexical dispersion&lt;/a&gt;. There&#39;s value in separating function words from content words, and grouping lemmas into word families. Lots more coming.&lt;/p&gt;
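&lt;p&gt;As a rough idea of what the overlap question might look like in code, here is a small Python sketch. The function names and the toy lemma lists are hypothetical, and a real comparison would also have to deal with differences in lemmatisation between the two sources.&lt;/p&gt;

```python
from collections import Counter


def core_lemmas(lemmas, percent):
    """Return the smallest set of top-frequency lemmas covering percent of tokens."""
    counts = Counter(lemmas)
    total = sum(counts.values())
    core, covered = set(), 0
    for lemma, count in counts.most_common():
        if covered * 100 >= percent * total:
            break
        core.add(lemma)
        covered += count
    return core


def shared_fraction(core_a, core_b):
    """Fraction of the lemmas in core_a that also appear in core_b."""
    return len(core_a.intersection(core_b)) / len(core_a)


# Toy lemma lists (not real counts from either subcorpus).
toy_plato = ["καί"] * 6 + ["λέγω"] * 2 + ["ψυχή"] * 2
toy_nt = ["καί"] * 6 + ["θεός"] * 2 + ["λέγω"] * 2
print(shared_fraction(core_lemmas(toy_plato, 80), core_lemmas(toy_nt, 80)))
```

&lt;p&gt;Swapping the two arguments to &lt;code&gt;shared_fraction&lt;/code&gt; gives the &#34;vice versa&#34; direction.&lt;/p&gt;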
&lt;p&gt;The code for much of this will be added to &lt;code&gt;vocabulary-tools&lt;/code&gt; when I get a chance but if people are interested in other subcorpora before then, please get in contact with me.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE (2019-11-06)&lt;/strong&gt;: Now see Seumas&#39;s post &lt;a href=&#34;https://thepatrologist.com/2019/11/06/sore-thumbs-in-subcorpus-vocabulary/&#34;&gt;Sore Thumbs in Subcorpus vocabulary&lt;/a&gt; looking at particular words that differ in frequency between the New Testament and the larger Classical Greek prose subcorpus we&#39;ve been working with.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Long-time readers of this blog know that, along with morphology, a core research area of mine is vocabulary. Prompted by Seumas Macdonald and now as part of the &lt;a href=&#34;{% post_url 2019-11-02-greek-texts-project %}&#34;&gt;Greek Texts Project&lt;/a&gt;, I started putting together some vocabulary coverage statistics for various subcorpora of Greek prose.</summary>
  </entry><entry>
    <title type="html">New Blog Platform</title>
    <link href="https://jktauber.com/2019/11/03/new-blog-platform/" rel="alternate" type="text/html" title="New Blog Platform"/>
    <published>2019-11-03T16:55:52-05:00</published>
    <updated>2019-11-03T16:55:52-05:00</updated>
    <id>https://jktauber.com/2019/11/03/new-blog-platform</id>
    <content type="html" xml:base="https://jktauber.com/2019/11/03/new-blog-platform/">&lt;p&gt;Following on from my success with it on the &lt;a href=&#34;https://digitaltolkien.com&#34;&gt;Digital Tolkien Project&lt;/a&gt; website, I decided to switch to using &lt;a href=&#34;https://jekyllrb.com&#34;&gt;Jekyll&lt;/a&gt; for the generation of &lt;strong&gt;jktauber.com&lt;/strong&gt; as a static site.&lt;/p&gt;
&lt;p&gt;I wanted to switch to static site generation partly for ease of hosting but mostly to make it much easier to author content locally and manage revisions on GitHub. My choice of Jekyll as a platform was a combination of ease-of-use and ability to trivially host on GitHub pages.&lt;/p&gt;
&lt;p&gt;There may be a couple of issues with my migration, so let me know if you see anything funny, especially with the formatting of posts. Note also that I haven&#39;t brought back full-text search or the &lt;strong&gt;Labs&lt;/strong&gt; yet.&lt;/p&gt;
&lt;p&gt;But the change already makes it easier for me to write posts as well as organise existing posts. I&#39;ve already started adding tags on a handful of posts to make it easier for you to get to all the posts on a particular topic. It&#39;s now possible, for example, to easily get a list of my &lt;a href=&#34;/tag/morphology-tour/&#34;&gt;morphology tour&lt;/a&gt; posts.&lt;/p&gt;
&lt;p&gt;I also made a couple of style tweaks but nothing major.&lt;/p&gt;
&lt;p&gt;I&#39;m now back to posting a lot more regularly with lots of good vocabulary and morphology stuff coming up!&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Following on from my success with it on the &lt;a href=&#34;https://digitaltolkien.com&#34;&gt;Digital Tolkien Project&lt;/a&gt; website, I decided to switch to using &lt;a href=&#34;https://jekyllrb.com&#34;&gt;Jekyll&lt;/a&gt; for the generation of &lt;strong&gt;jktauber.com&lt;/strong&gt; as a static site.</summary>
  </entry><entry>
    <title type="html">Greek Texts Project</title>
    <link href="https://jktauber.com/2019/11/02/greek-texts-project/" rel="alternate" type="text/html" title="Greek Texts Project"/>
    <published>2019-11-02</published>
    <updated>2019-11-02</updated>
    <id>https://jktauber.com/2019/11/02/greek-texts-project</id>
    <content type="html" xml:base="https://jktauber.com/2019/11/02/greek-texts-project/">&lt;p&gt;A twitter conversation led to the creation of a new project to work on annotated Greek texts for language learners.&lt;/p&gt;
&lt;p&gt;As readers of this blog know, I&#39;m working with Seumas Macdonald on the &lt;a href=&#34;/2019/02/01/initial-apostolic-fathers-text-complete/&#34;&gt;Apostolic Fathers&lt;/a&gt; and had done some earlier work on Epictetus&#39;s &lt;a href=&#34;https://github.com/jtauber/enchiridion&#34;&gt;Enchiridion&lt;/a&gt; and with Nathan Smith on the &lt;a href=&#34;https://github.com/nathans/lxx-swete&#34;&gt;Septuagint&lt;/a&gt;. There&#39;d been a couple of conversations earlier in October on Twitter with people wanting to make progress on the Septuagint and I&#39;d also been keen to get back to supporting Seumas&#39;s &lt;a href=&#34;https://github.com/seumasjeltzz/LinguaeGraecaePerSeIllustrata&#34;&gt;Linguae Graecae Per Se Illustrata&lt;/a&gt; project.&lt;/p&gt;
&lt;p&gt;So Fletcher Hardison&#39;s tweet triggered the idea to maybe get everyone involved in these various projects talking to each other:&lt;/p&gt;
&lt;blockquote class=&#34;twitter-tweet&#34;&gt;&lt;p lang=&#34;en&#34; dir=&#34;ltr&#34;&gt;We should definitely coordinate file formats for things like so we can build common tools /cc &lt;a href=&#34;https://twitter.com/jeltzz?ref_src=twsrc%5Etfw&#34;&gt;@jeltzz&lt;/a&gt; &lt;a href=&#34;https://twitter.com/sleeptillseven?ref_src=twsrc%5Etfw&#34;&gt;@sleeptillseven&lt;/a&gt; &lt;a href=&#34;https://twitter.com/_ndsmith?ref_src=twsrc%5Etfw&#34;&gt;@_ndsmith&lt;/a&gt;. I&amp;#39;d love to consolidate AF, LXX, Epictetus, LGPSI, MorphGNT, κτλ.&lt;/p&gt;&amp;mdash; James Tauber (@jtauber) &lt;a href=&#34;https://twitter.com/jtauber/status/1182695318506430467?ref_src=twsrc%5Etfw&#34;&gt;October 11, 2019&lt;/a&gt;&lt;/blockquote&gt;
&lt;script async src=&#34;https://platform.twitter.com/widgets.js&#34; charset=&#34;utf-8&#34;&gt;&lt;/script&gt;

&lt;p&gt;I started a Slack workspace and a landing page: &lt;a href=&#34;https://jtauber.github.io/greek-texts/&#34;&gt;https://jtauber.github.io/greek-texts/&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The results so far have been wonderful. A great group of people are already working together on various texts and data format conventions. I&#39;m back to working on vocabulary lists (more on that in the next few days) and some new Python libraries (more on that also in the next few days).&lt;/p&gt;
&lt;p&gt;We&#39;d love to have you join us! Just email me at jtauber@jtauber.com and I can invite you to the Slack workspace.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">A twitter conversation led to the creation of a new project to work on annotated Greek texts for language learners.</summary>
  </entry><entry>
    <title type="html">Release of greek-normalisation 0.3</title>
    <link href="https://jktauber.com/2019/11/01/release-greek-normalisation-03/" rel="alternate" type="text/html" title="Release of greek-normalisation 0.3"/>
    <published>2019-11-01</published>
    <updated>2019-11-01</updated>
    <id>https://jktauber.com/2019/11/01/release-greek-normalisation-03</id>
    <content type="html" xml:base="https://jktauber.com/2019/11/01/release-greek-normalisation-03/">&lt;p&gt;In the last couple of weeks I&#39;ve done a couple of minor releases of the &lt;code&gt;greek-normalisation&lt;/code&gt; Python library which brings together various code I use to clean up Greek texts and normalise the forms.&lt;/p&gt;
&lt;p&gt;The 0.2 release (which I neglected to announce) just had a small fix to the &lt;code&gt;breathing_check&lt;/code&gt; function to support things like ἀϊ (which failed before because it didn&#39;t take into account the diaeresis). Soon I&#39;ll blog about a new Python tool I&#39;ve been building which will provide a framework for doing lots of checks like this.&lt;/p&gt;
&lt;p&gt;The 0.3 release now installs two command-line scripts &lt;code&gt;toNFC&lt;/code&gt; and &lt;code&gt;toNFD&lt;/code&gt; to convert a file to either an NFC or NFD Unicode Normalization Form.&lt;/p&gt;
&lt;p&gt;Once installed you can do things like:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;toNFC source.txt &amp;gt; nfc_version.txt
&lt;/code&gt;&lt;/pre&gt;
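&lt;p&gt;For anyone curious what such a conversion amounts to, here is a minimal sketch using the standard library&#39;s &lt;code&gt;unicodedata.normalize&lt;/code&gt;. It is an illustration only, not the source of the scripts, and the helper name &lt;code&gt;to_form&lt;/code&gt; is made up for this example.&lt;/p&gt;

```python
import unicodedata


def to_form(text, form="NFC"):
    """Normalise text to the given Unicode Normalization Form (e.g. NFC or NFD)."""
    return unicodedata.normalize(form, text)


# alpha + combining smooth breathing (U+0313) + combining acute (U+0301)
decomposed = "\u03b1\u0313\u0301"
composed = to_form(decomposed, "NFC")

# The composed form is a single precomposed code point; NFD reverses it.
print(len(decomposed), len(composed))
print(to_form(composed, "NFD") == decomposed)
```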

&lt;p&gt;The repository is &lt;a href=&#34;https://github.com/jtauber/greek-normalisation&#34;&gt;https://github.com/jtauber/greek-normalisation&lt;/a&gt; and it&#39;s pip-installable as &lt;code&gt;greek-normalisation&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;See my previous post &lt;a href=&#34;/2018/07/23/normalisation-column-morphgnt/&#34;&gt;The Normalisation Column in MorphGNT&lt;/a&gt; for the original work this code came from.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">In the last couple of weeks I&#39;ve done a couple of minor releases of the &lt;code&gt;greek-normalisation&lt;/code&gt; Python library which brings together various code I use to clean up Greek texts and normalise the forms.</summary>
  </entry><entry>
    <title type="html">Summer Conferences</title>
    <link href="https://jktauber.com/2019/07/06/summer-conferences/" rel="alternate" type="text/html" title="Summer Conferences"/>
    <published>2019-07-06</published>
    <updated>2019-07-06</updated>
    <id>https://jktauber.com/2019/07/06/summer-conferences</id>
    <content type="html" xml:base="https://jktauber.com/2019/07/06/summer-conferences/">&lt;p&gt;Here are the conferences I&#39;m attending (and in some cases, presenting at) in June through August. I probably should have posted this at the start of my conference travel, but here it is.&lt;/p&gt;
&lt;h3&gt;&lt;a href=&#34;https://lila-erc.eu/1st-lila-ws/&#34;&gt;First LiLa Workshop: Linguistic Resources &amp;amp; NLP Tools for Latin&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;I&#39;m excited about the LiLa project, which is about a Linguistic Linked Open Data (LLOD) approach to Latin resources. Because I&#39;m interested in LLOD for Ancient Greek, I was keen to attend the first workshop to get ideas, but then I got asked to speak about &lt;a href=&#34;https://scaife-viewer.org&#34;&gt;Scaife&lt;/a&gt; anyway.&lt;/p&gt;
&lt;h3&gt;&lt;a href=&#34;http://versologie.cz/conference2019/index.php&#34;&gt;Quantitative Approaches to Versification&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;This was a conference about computational analysis of poetry (especially meter). I had done some work with Sophia Sklaviadis on the relationship between repeating n-grams and metrical position in Homer and presented a paper on it at this conference. Not normally my area but I have some more ideas to pursue that I might write about here at some point.&lt;/p&gt;
&lt;h3&gt;&lt;a href=&#34;https://vocabatleuven.wordpress.com&#34;&gt;Vocab@LEUVEN&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;When I went to the American Association for Applied Linguistics annual meeting last year, I mostly attended the track on vocabulary research. Regular readers of this blog know that, along with morphology, it&#39;s my main research area. Well, the Vocab@ conferences are 100% vocabulary research. I did actually submit a paper to this conference that got rejected but I&#39;ll be presenting it as a poster at EuroCALL (see below).&lt;/p&gt;
&lt;h3&gt;&lt;a href=&#34;https://dh2019.adho.org&#34;&gt;Digital Humanities Conference 2019&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;The big DH conference of the year. Will be my first time attending and I&#39;m sure it will be overwhelming. I&#39;m presenting as part of a panel on &lt;em&gt;Confronting the Complexity of Babel in a Global and Digital Age&lt;/em&gt; and I&#39;ll specifically be talking about online reading environments to scaffold understanding of texts in historical languages.&lt;/p&gt;
&lt;p&gt;After this I&#39;m briefly heading back to Boston for a couple of weeks. Then two Tolkien-related conferences:&lt;/p&gt;
&lt;h3&gt;&lt;a href=&#34;http://omentielva.com&#34;&gt;Omentielva&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;This is the International Conference on J.R.R. Tolkien’s Invented Languages. Not speaking (Elvish or otherwise) just attending.&lt;/p&gt;
&lt;h3&gt;&lt;a href=&#34;https://www.tolkien2019.com&#34;&gt;Tolkien 2019&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Giving a talk on &lt;em&gt;Tolkien and Digital Philology&lt;/em&gt;, basically how we might treat Tolkien&#39;s works as the objects of philological study and use the same digital methods one might for, say, an Ancient Greek text. The talk will culminate in me outlining my vision for the &lt;a href=&#34;https://digitaltolkien.com&#34;&gt;Digital Tolkien&lt;/a&gt; project.&lt;/p&gt;
&lt;p&gt;And finally:&lt;/p&gt;
&lt;h3&gt;&lt;a href=&#34;https://www.eurocall-languages.org/conferences/current-conference&#34;&gt;EUROCALL 2019&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;This is the major European conference for Computer-Aided Language Learning. I&#39;m presenting a poster on what is possibly the longest running topic of this blog: the sequencing of vocabulary learning from texts. There&#39;ll be lots more blog posts here on that in the future!&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Here are the conferences I&#39;m attending (and in some cases, presenting at) in June through August. I probably should have posted this at the start of my conference travel, but here it is.</summary>
  </entry><entry>
    <title type="html">Release of greek-normalisation 0.1</title>
    <link href="https://jktauber.com/2019/07/06/release-greek-normalisation-01/" rel="alternate" type="text/html" title="Release of greek-normalisation 0.1"/>
    <published>2019-07-06</published>
    <updated>2019-07-06</updated>
    <id>https://jktauber.com/2019/07/06/release-greek-normalisation-01</id>
    <content type="html" xml:base="https://jktauber.com/2019/07/06/release-greek-normalisation-01/">&lt;p&gt;For years I’ve had Python code for normalising Greek forms, checking for stray characters, etc. I finally got around to consolidating them in a library.&lt;/p&gt;
&lt;p&gt;It has a few little utilities like:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; strip_last_accent_if_two(&#39;γυναῖκά&#39;)
&#39;γυναῖκα&#39;

&amp;gt;&amp;gt;&amp;gt; grave_to_acute(&#39;τὴν&#39;)
&#39;τήν&#39;

&amp;gt;&amp;gt;&amp;gt; breathing_check(&#39;ἀι&#39;)
False
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;but the core of it is the normalisation of tokens with knowledge of clitics and elision.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; normalise(&#39;τὴν&#39;)
(&#39;τήν&#39;, [&#39;grave&#39;])

&amp;gt;&amp;gt;&amp;gt; normalise(&#39;γυναῖκά&#39;)
(&#39;γυναῖκα&#39;, [&#39;extra&#39;])

&amp;gt;&amp;gt;&amp;gt; normalise(&#39;σου&#39;)
(&#39;σου&#39;, [&#39;enclitic&#39;])

&amp;gt;&amp;gt;&amp;gt; normalise(&#39;Τὴν&#39;)
(&#39;τήν&#39;, [&#39;grave&#39;, &#39;capitalisation&#39;])

&amp;gt;&amp;gt;&amp;gt; normalise(&#39;ὁ&#39;)
(&#39;ὁ&#39;, [&#39;proclitic&#39;])

&amp;gt;&amp;gt;&amp;gt; normalise(&#39;μετ’&#39;)
(&#39;μετά&#39;, [&#39;elision&#39;])

&amp;gt;&amp;gt;&amp;gt; normalise(&#39;οὐκ&#39;)
(&#39;οὐ&#39;, [&#39;movable&#39;, &#39;proclitic&#39;])
&lt;/code&gt;&lt;/pre&gt;
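&lt;p&gt;To give a feel for how a utility like the grave-to-acute conversion can work, here is a minimal sketch built only on the standard library&#39;s &lt;code&gt;unicodedata&lt;/code&gt; module. It is an assumption about one possible approach, not the library&#39;s actual implementation, and the name &lt;code&gt;grave_to_acute_sketch&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
import unicodedata

GRAVE = "\u0300"  # combining grave accent
ACUTE = "\u0301"  # combining acute accent


def grave_to_acute_sketch(word):
    """Swap every grave accent for an acute, returning an NFC string."""
    # Decompose so that accents become separate combining characters.
    decomposed = unicodedata.normalize("NFD", word)
    # Replace the accent, then recompose into precomposed characters.
    return unicodedata.normalize("NFC", decomposed.replace(GRAVE, ACUTE))


print(grave_to_acute_sketch("τὴν"))
```

&lt;p&gt;Working on the decomposed (NFD) form means the accent can be swapped without a lookup table covering every vowel and breathing combination.&lt;/p&gt;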

&lt;p&gt;See my previous post &lt;a href=&#34;/2018/07/23/normalisation-column-morphgnt/&#34;&gt;The Normalisation Column in MorphGNT&lt;/a&gt; for the original work this code came from.&lt;/p&gt;
&lt;p&gt;There are also some regular expressions that I&#39;ve used to check mistakes in things like the &lt;a href=&#34;https://jtauber.github.io/apostolic-fathers/&#34;&gt;Open Apostolic Fathers&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It&#39;s just an initial 0.1 release but parts of the code have already been in use for years.&lt;/p&gt;
&lt;p&gt;The repository is &lt;a href=&#34;https://github.com/jtauber/greek-normalisation&#34;&gt;https://github.com/jtauber/greek-normalisation&lt;/a&gt; and it&#39;s pip-installable as &lt;code&gt;greek-normalisation&lt;/code&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">For years I’ve had Python code for normalising Greek forms, checking for stray characters, etc. I finally got around to consolidating them in a library.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 28</title>
    <link href="https://jktauber.com/2019/04/30/tour-greek-morphology-part-28/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 28"/>
    <published>2019-04-30</published>
    <updated>2019-04-30</updated>
    <id>https://jktauber.com/2019/04/30/tour-greek-morphology-part-28</id>
    <content type="html" xml:base="https://jktauber.com/2019/04/30/tour-greek-morphology-part-28/">&lt;p&gt;Part twenty-eight of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;In this post, we look systematically at the imperfect active distinguishers in much the same way as we did the present active distinguishers in &lt;a href=&#34;/2017/08/26/tour-greek-morphology-part-13/&#34;&gt;Part 13&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Before we summarise all the distinguisher paradigms we&#39;ve seen so far, there are actually three forms in the SBLGNT not covered yet: εἰσῄει, παρῆσαν, and συνῆσαν (all in Luke/Acts). εἰσῄει is from εἰς+εἶμι (making it a compound of &lt;strong&gt;IA-11&lt;/strong&gt;) and παρῆσαν is παρά+εἰμί (making it a compound of &lt;strong&gt;IA-10&lt;/strong&gt;). In our text, συνῆσαν is from σύν+εἰμί but &lt;em&gt;could&lt;/em&gt; be from σύν+εἶμι. Either way, for completeness we need to add &lt;strong&gt;IA-10-COMP&lt;/strong&gt; and &lt;strong&gt;IA-11-COMP&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;So with those, here are all the imperfect active distinguisher paradigms we&#39;ve discussed:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IA-1&lt;/th&gt;
&lt;th&gt;IA-2&lt;/th&gt;
&lt;th&gt;IA-3&lt;/th&gt;
&lt;th&gt;IA-4&lt;/th&gt;
&lt;th&gt;IA-5&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;Xον&lt;/td&gt;
&lt;td&gt;Xουν&lt;/td&gt;
&lt;td&gt;Xουν&lt;/td&gt;
&lt;td&gt;Xων&lt;/td&gt;
&lt;td&gt;Xων&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;Xες&lt;/td&gt;
&lt;td&gt;Xεις&lt;/td&gt;
&lt;td&gt;Xους&lt;/td&gt;
&lt;td&gt;Xᾱς&lt;/td&gt;
&lt;td&gt;Xης&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;Xε(ν)&lt;/td&gt;
&lt;td&gt;Xει&lt;/td&gt;
&lt;td&gt;Xου&lt;/td&gt;
&lt;td&gt;Xᾱ&lt;/td&gt;
&lt;td&gt;Xη&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;Xομεν&lt;/td&gt;
&lt;td&gt;Xοῦμεν&lt;/td&gt;
&lt;td&gt;Xοῦμεν&lt;/td&gt;
&lt;td&gt;Xῶμεν&lt;/td&gt;
&lt;td&gt;Xῶμεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;Xετε&lt;/td&gt;
&lt;td&gt;Xεῖτε&lt;/td&gt;
&lt;td&gt;Xοῦτε&lt;/td&gt;
&lt;td&gt;Xᾶτε&lt;/td&gt;
&lt;td&gt;Xῆτε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;Xον&lt;/td&gt;
&lt;td&gt;Xουν&lt;/td&gt;
&lt;td&gt;Xουν&lt;/td&gt;
&lt;td&gt;Xων&lt;/td&gt;
&lt;td&gt;Xων&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IA-6&lt;/th&gt;
&lt;th&gt;IA-7&lt;/th&gt;
&lt;th&gt;IA-8&lt;/th&gt;
&lt;th&gt;IA-9&lt;/th&gt;
&lt;th&gt;IA-9b&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;Xῡν&lt;/td&gt;
&lt;td&gt;Xην/Xειν&lt;/td&gt;
&lt;td&gt;Xουν&lt;/td&gt;
&lt;td&gt;Xην&lt;/td&gt;
&lt;td&gt;Xην&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;Xῡς&lt;/td&gt;
&lt;td&gt;Xεις&lt;/td&gt;
&lt;td&gt;Xους&lt;/td&gt;
&lt;td&gt;Xης&lt;/td&gt;
&lt;td&gt;Xης/Xησθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;Xῡ&lt;/td&gt;
&lt;td&gt;Xει&lt;/td&gt;
&lt;td&gt;Xου&lt;/td&gt;
&lt;td&gt;Xη&lt;/td&gt;
&lt;td&gt;Xη&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;Xυμεν&lt;/td&gt;
&lt;td&gt;Xεμεν&lt;/td&gt;
&lt;td&gt;Xομεν&lt;/td&gt;
&lt;td&gt;Xαμεν&lt;/td&gt;
&lt;td&gt;Xαμεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;Xυτε&lt;/td&gt;
&lt;td&gt;Xετε&lt;/td&gt;
&lt;td&gt;Xοτε&lt;/td&gt;
&lt;td&gt;Xατε&lt;/td&gt;
&lt;td&gt;Xατε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;Xυσαν&lt;/td&gt;
&lt;td&gt;Xεσαν&lt;/td&gt;
&lt;td&gt;Xοσαν&lt;/td&gt;
&lt;td&gt;Xασαν&lt;/td&gt;
&lt;td&gt;Xασαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IA-10&lt;/th&gt;
&lt;th&gt;IA-11&lt;/th&gt;
&lt;th&gt;IA-10-COMP&lt;/th&gt;
&lt;th&gt;IA-11-COMP&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;ἦ/ἦν&lt;/td&gt;
&lt;td&gt;ᾖα/ᾔειν&lt;/td&gt;
&lt;td&gt;Xῆ/Xῆν&lt;/td&gt;
&lt;td&gt;Xῇα/Xῄειν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;ἦς/ἦσθα&lt;/td&gt;
&lt;td&gt;ᾔεις/ᾔεισθα&lt;/td&gt;
&lt;td&gt;Xῆς/Xῆσθα&lt;/td&gt;
&lt;td&gt;Xῄεις/Xῄεισθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;ἦν&lt;/td&gt;
&lt;td&gt;ᾔει(ν)&lt;/td&gt;
&lt;td&gt;Xῆν&lt;/td&gt;
&lt;td&gt;Xῄει(ν)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;ἦμεν&lt;/td&gt;
&lt;td&gt;ᾖμεν&lt;/td&gt;
&lt;td&gt;Xῆμεν&lt;/td&gt;
&lt;td&gt;Xῇμεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;ἦτε&lt;/td&gt;
&lt;td&gt;ᾖτε&lt;/td&gt;
&lt;td&gt;Xῆτε&lt;/td&gt;
&lt;td&gt;Xῇτε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;ἦσαν&lt;/td&gt;
&lt;td&gt;ᾖσαν/ᾔεσαν&lt;/td&gt;
&lt;td&gt;Xῆσαν&lt;/td&gt;
&lt;td&gt;Xῇσαν/Xῄεσαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;It will be worth taking some future posts to talk about the -σθα ending that crops up in the &lt;strong&gt;2SG&lt;/strong&gt; as well as some of the more extraordinary forms in &lt;strong&gt;IA-10&lt;/strong&gt; and &lt;strong&gt;IA-11&lt;/strong&gt; (along with compounds).&lt;/p&gt;
&lt;p&gt;But for now, just capturing the common element in each row (like we did in Part 13):&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;th&gt;&amp;nbsp;&lt;/th&gt;    &lt;th nowrap&gt;IA-1&lt;/th&gt; &lt;th nowrap&gt;IA-2&lt;/th&gt; &lt;th nowrap&gt;IA-3&lt;/th&gt; &lt;th nowrap&gt;IA-4&lt;/th&gt; &lt;th nowrap&gt;IA-5&lt;/th&gt; &lt;th nowrap&gt;IA-6&lt;/th&gt; &lt;th nowrap&gt;IA-7&lt;/th&gt; &lt;th nowrap&gt;IA-8&lt;/th&gt; &lt;th nowrap&gt;IA-9&lt;/th&gt; &lt;th nowrap&gt;IA-10&lt;/th&gt;   &lt;th nowrap&gt;IA-11&lt;/th&gt; &lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;1SG&lt;/th&gt; &lt;td colspan=&#34;11&#34; class=&#34;text-center&#34;&gt;-ν&lt;/td&gt;   &lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;2SG&lt;/th&gt; &lt;td colspan=&#34;9&#34; class=&#34;text-center&#34;&gt;-ς&lt;/td&gt;    &lt;td colspan=&#34;2&#34; class=&#34;text-center&#34;&gt;-ς/-σθα&lt;/td&gt; &lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;3SG&lt;/th&gt; &lt;td colspan=&#34;9&#34; class=&#34;text-center&#34;&gt;-&lt;/td&gt;     &lt;td colspan=&#34;2&#34; class=&#34;text-center&#34;&gt;-(ν)&lt;/td&gt; &lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;1PL&lt;/th&gt; &lt;td colspan=&#34;11&#34; class=&#34;text-center&#34;&gt;-μεν&lt;/td&gt; &lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;2PL&lt;/th&gt; &lt;td colspan=&#34;11&#34; class=&#34;text-center&#34;&gt;-τε&lt;/td&gt;          &lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;3PL&lt;/th&gt; &lt;td colspan=&#34;5&#34; class=&#34;text-center&#34;&gt;-ν&lt;/td&gt;   &lt;td colspan=&#34;6&#34; class=&#34;text-center&#34;&gt;-σαν&lt;/td&gt;   &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;As with the present active paradigms, some cells across inflectional classes have identical distinguishers and so those cells alone can&#39;t identify the inflectional class (and hence all the other forms in that class). In particular:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;1SG&lt;/strong&gt; can&#39;t distinguish within the set {&lt;strong&gt;IA-2&lt;/strong&gt;, &lt;strong&gt;IA-3&lt;/strong&gt;, &lt;strong&gt;IA-8&lt;/strong&gt;} or within the set {&lt;strong&gt;IA-4&lt;/strong&gt;, &lt;strong&gt;IA-5&lt;/strong&gt;} or within the set {&lt;strong&gt;IA-7&lt;/strong&gt; (if η), &lt;strong&gt;IA-9&lt;/strong&gt;}&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;2SG&lt;/strong&gt; and &lt;strong&gt;3SG&lt;/strong&gt; can&#39;t distinguish within the set {&lt;strong&gt;IA-2&lt;/strong&gt;, &lt;strong&gt;IA-7&lt;/strong&gt;} or within the set {&lt;strong&gt;IA-3&lt;/strong&gt;, &lt;strong&gt;IA-8&lt;/strong&gt;} or within the set {&lt;strong&gt;IA-5&lt;/strong&gt;, &lt;strong&gt;IA-9&lt;/strong&gt;}&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;1PL&lt;/strong&gt; can&#39;t distinguish within the set {&lt;strong&gt;IA-2&lt;/strong&gt;, &lt;strong&gt;IA-3&lt;/strong&gt;} or within the set {&lt;strong&gt;IA-4&lt;/strong&gt;, &lt;strong&gt;IA-5&lt;/strong&gt;} or within the set {&lt;strong&gt;IA-1&lt;/strong&gt;, &lt;strong&gt;IA-8&lt;/strong&gt;}&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;2PL&lt;/strong&gt; can&#39;t distinguish within the set {&lt;strong&gt;IA-1&lt;/strong&gt;, &lt;strong&gt;IA-7&lt;/strong&gt;}&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;3PL&lt;/strong&gt; can&#39;t distinguish within the set {&lt;strong&gt;IA-2&lt;/strong&gt;, &lt;strong&gt;IA-3&lt;/strong&gt;} or within the set {&lt;strong&gt;IA-4&lt;/strong&gt;, &lt;strong&gt;IA-5&lt;/strong&gt;}&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The distinctions from &lt;strong&gt;IA-7&lt;/strong&gt; on up are less important because they are tiny, non-productive classes. Looking at just &lt;strong&gt;IA-1&lt;/strong&gt; through &lt;strong&gt;IA-6&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;{&lt;strong&gt;IA-2&lt;/strong&gt;, &lt;strong&gt;IA-3&lt;/strong&gt;} can&#39;t be distinguished by &lt;strong&gt;1SG&lt;/strong&gt;, &lt;strong&gt;1PL&lt;/strong&gt;, or &lt;strong&gt;3PL&lt;/strong&gt; but &lt;em&gt;can&lt;/em&gt; by &lt;strong&gt;2SG&lt;/strong&gt;, &lt;strong&gt;3SG&lt;/strong&gt;, or &lt;strong&gt;2PL&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;{&lt;strong&gt;IA-4&lt;/strong&gt;, &lt;strong&gt;IA-5&lt;/strong&gt;} also can&#39;t be distinguished by &lt;strong&gt;1SG&lt;/strong&gt;, &lt;strong&gt;1PL&lt;/strong&gt;, or &lt;strong&gt;3PL&lt;/strong&gt; but &lt;em&gt;can&lt;/em&gt; by &lt;strong&gt;2SG&lt;/strong&gt;, &lt;strong&gt;3SG&lt;/strong&gt;, or &lt;strong&gt;2PL&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So at least for the first six classes, any of &lt;strong&gt;2SG&lt;/strong&gt;, &lt;strong&gt;3SG&lt;/strong&gt;, or &lt;strong&gt;2PL&lt;/strong&gt; uniquely identifies the class (at least within the imperfect active system).&lt;/p&gt;
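&lt;p&gt;That conclusion can be checked mechanically. What follows is only a minimal Python sketch: it encodes nothing more than the ambiguity sets listed in the bullets above, and the names &lt;code&gt;AMBIGUOUS&lt;/code&gt;, &lt;code&gt;confusable&lt;/code&gt; and &lt;code&gt;distinguishing_cells&lt;/code&gt; are made up for illustration.&lt;/p&gt;

```python
from itertools import combinations

# The first six imperfect active classes, as discussed above.
CLASSES = ["IA-1", "IA-2", "IA-3", "IA-4", "IA-5", "IA-6"]

# For each cell, the sets of classes that cell cannot tell apart,
# copied from the bullet list above. The 1SG set with IA-7 only
# applies when IA-7 has eta; it is included here unconditionally.
AMBIGUOUS = {
    "1SG": [{"IA-2", "IA-3", "IA-8"}, {"IA-4", "IA-5"}, {"IA-7", "IA-9"}],
    "2SG": [{"IA-2", "IA-7"}, {"IA-3", "IA-8"}, {"IA-5", "IA-9"}],
    "3SG": [{"IA-2", "IA-7"}, {"IA-3", "IA-8"}, {"IA-5", "IA-9"}],
    "1PL": [{"IA-2", "IA-3"}, {"IA-4", "IA-5"}, {"IA-1", "IA-8"}],
    "2PL": [{"IA-1", "IA-7"}],
    "3PL": [{"IA-2", "IA-3"}, {"IA-4", "IA-5"}],
}


def confusable(cell, a, b):
    """True if the cell has the same distinguisher for classes a and b."""
    return any(a in group and b in group for group in AMBIGUOUS[cell])


def distinguishing_cells(classes):
    """Cells that separate every pair of the given classes."""
    return [
        cell
        for cell in AMBIGUOUS
        if not any(confusable(cell, a, b) for a, b in combinations(classes, 2))
    ]


print(distinguishing_cells(CLASSES))
```

&lt;p&gt;With the sets as listed, this prints &lt;code&gt;[&#39;2SG&#39;, &#39;3SG&#39;, &#39;2PL&#39;]&lt;/code&gt;, matching the conclusion above. Passing in a longer list of classes shows how quickly the tiny classes erode that.&lt;/p&gt;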
&lt;p&gt;It is interesting then that the &lt;strong&gt;2SG&lt;/strong&gt; and &lt;strong&gt;3SG&lt;/strong&gt; are the very cells most likely to cause confusion within the sets {&lt;strong&gt;IA-2&lt;/strong&gt;, &lt;strong&gt;IA-7&lt;/strong&gt;}, {&lt;strong&gt;IA-3&lt;/strong&gt;, &lt;strong&gt;IA-8&lt;/strong&gt;}, and {&lt;strong&gt;IA-5&lt;/strong&gt;, &lt;strong&gt;IA-9&lt;/strong&gt;} and in those cases, it is the &lt;strong&gt;1PL&lt;/strong&gt; or &lt;strong&gt;3PL&lt;/strong&gt; that can come to the rescue in identifying the class (although the value of X itself can do that given the tiny size of the &lt;strong&gt;IA-7&lt;/strong&gt;, &lt;strong&gt;IA-8&lt;/strong&gt; and &lt;strong&gt;IA-9&lt;/strong&gt; classes).&lt;/p&gt;
&lt;p&gt;If we try to group our classes along the lines we did in &lt;a href=&#34;/2017/08/26/tour-greek-morphology-part-13/&#34;&gt;Part 13&lt;/a&gt;, we get a hierarchy very similar to that in the present:&lt;/p&gt;
&lt;table&gt;
  &lt;tr&gt;
    &lt;td colspan=&#34;3&#34;&gt;&lt;b&gt;IA-&lt;/b&gt;{&lt;b&gt;1&lt;/b&gt;, &lt;b&gt;2&lt;/b&gt;, &lt;b&gt;3&lt;/b&gt;, &lt;b&gt;4&lt;/b&gt;, &lt;b&gt;5&lt;/b&gt;}&lt;/td&gt;
    &lt;td colspan=&#34;3&#34;&gt;&lt;b&gt;3PL&lt;/b&gt; in -ν; &lt;b&gt;1SG&lt;/b&gt; and &lt;b&gt;3PL&lt;/b&gt; identical&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td colspan=&#34;2&#34;&gt;&lt;b&gt;IA-&lt;/b&gt;{&lt;b&gt;2&lt;/b&gt;, &lt;b&gt;3&lt;/b&gt;, &lt;b&gt;4&lt;/b&gt;, &lt;b&gt;5&lt;/b&gt;}&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td colspan=&#34;2&#34;&gt;long vowels before the endings; circumflexes in the &lt;b&gt;1PL&lt;/b&gt; and &lt;b&gt;2PL&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;&lt;b&gt;IA-&lt;/b&gt;{&lt;b&gt;2&lt;/b&gt;, &lt;b&gt;3&lt;/b&gt;}&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;ου in &lt;b&gt;1SG&lt;/b&gt;, &lt;b&gt;1PL&lt;/b&gt;, and &lt;b&gt;3PL&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;&lt;b&gt;IA-&lt;/b&gt;{&lt;b&gt;4&lt;/b&gt;, &lt;b&gt;5&lt;/b&gt;}&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;ω in &lt;b&gt;1SG&lt;/b&gt;, &lt;b&gt;1PL&lt;/b&gt;, and &lt;b&gt;3PL&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td colspan=&#34;3&#34; nowrap&gt;&lt;b&gt;IA-&lt;/b&gt;{&lt;b&gt;6&lt;/b&gt;, &lt;b&gt;7&lt;/b&gt;, &lt;b&gt;8&lt;/b&gt;, &lt;b&gt;9&lt;/b&gt;, &lt;b&gt;9b&lt;/b&gt;, &lt;b&gt;10&lt;/b&gt;, &lt;b&gt;11&lt;/b&gt;, &lt;b&gt;10-COMP&lt;/b&gt;, &lt;b&gt;11-COMP&lt;/b&gt;}&lt;/td&gt;
    &lt;td colspan=&#34;3&#34;&gt;&lt;b&gt;3PL&lt;/b&gt; in -σαν&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td colspan=&#34;2&#34;&gt;&lt;b&gt;IA-&lt;/b&gt;{&lt;b&gt;6&lt;/b&gt;, &lt;b&gt;7&lt;/b&gt;, &lt;b&gt;8&lt;/b&gt;, &lt;b&gt;9&lt;/b&gt;}&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td colspan=&#34;2&#34;&gt;&lt;b&gt;2SG&lt;/b&gt; only in -ς&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td colspan=&#34;2&#34;&gt;&lt;b&gt;IA-&lt;/b&gt;{&lt;b&gt;9b&lt;/b&gt;, &lt;b&gt;10&lt;/b&gt;, &lt;b&gt;11&lt;/b&gt;, &lt;b&gt;10-COMP&lt;/b&gt;, &lt;b&gt;11-COMP&lt;/b&gt;}&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td colspan=&#34;2&#34;&gt;&lt;b&gt;2SG&lt;/b&gt; in -ς/-σθα&lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;along with cross-cutting categories such as:&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;IA-&lt;/strong&gt;{&lt;strong&gt;2&lt;/strong&gt;, &lt;strong&gt;3&lt;/strong&gt;, &lt;strong&gt;8&lt;/strong&gt;}&lt;/td&gt; &lt;td&gt;ουν in &lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;IA-&lt;/strong&gt;{&lt;strong&gt;2&lt;/strong&gt;, &lt;strong&gt;7&lt;/strong&gt;}&lt;/td&gt; &lt;td&gt;ει in &lt;strong&gt;2SG&lt;/strong&gt; and &lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;IA-&lt;/strong&gt;{&lt;strong&gt;3&lt;/strong&gt;, &lt;strong&gt;8&lt;/strong&gt;}&lt;/td&gt; &lt;td&gt;ου in &lt;strong&gt;1SG&lt;/strong&gt;, &lt;strong&gt;2SG&lt;/strong&gt;, and &lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;IA-&lt;/strong&gt;{&lt;strong&gt;1&lt;/strong&gt;, &lt;strong&gt;7&lt;/strong&gt;}&lt;/td&gt; &lt;td&gt;ετε in &lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;and, ignoring accents:&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;IA-&lt;/strong&gt;{&lt;strong&gt;4&lt;/strong&gt;, &lt;strong&gt;9&lt;/strong&gt;}&lt;/td&gt; &lt;td&gt;ατε in &lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
&lt;p&gt;But given the closed nature of &lt;strong&gt;IA-7&lt;/strong&gt; and up, many of these will be easy to disambiguate. We&#39;ll go through the details in a future post.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part twenty-eight of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">Consolidating Vocabulary Coverage and Ordering Tools</title>
    <link href="https://jktauber.com/2019/04/20/consolidating-vocabulary-coverage-and-ordering-too/" rel="alternate" type="text/html" title="Consolidating Vocabulary Coverage and Ordering Tools"/>
    <published>2019-04-20</published>
    <updated>2019-04-20</updated>
    <id>https://jktauber.com/2019/04/20/consolidating-vocabulary-coverage-and-ordering-too</id>
    <content type="html" xml:base="https://jktauber.com/2019/04/20/consolidating-vocabulary-coverage-and-ordering-too/">&lt;p&gt;One of my goals for 2019 is to bring more structure to various disparate Greek projects and, as part of that, I’ve started consolidating multiple one-off projects I’ve done around vocabulary coverage statistics and ordering experiments.&lt;/p&gt;
&lt;p&gt;Going back at least 15 years (when I first started blogging about &lt;a href=&#34;/2004/11/26/programmed-vocabulary-learning-travelling-salesman/&#34;&gt;Programmed Vocabulary Learning&lt;/a&gt;) I’ve had little Python scripts all over the place to calculate various stats, or try out various approaches to ordering.&lt;/p&gt;
&lt;p&gt;I’m bringing all of that together in a single repository and updating the code so:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;it’s all in one place&lt;/li&gt;
&lt;li&gt;it’s usable as a library in other projects or in things like Jupyter notebooks&lt;/li&gt;
&lt;li&gt;it can be extended to arbitrary chunking beyond verses (e.g. books, chapters, sentences, paragraphs, pericopes)&lt;/li&gt;
&lt;li&gt;it can be extended to other texts such as the Apostolic Fathers, Homer, etc (other languages too!)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I’m partly spurred on by a desire to explore more of the stuff &lt;a href=&#34;https://thepatrologist.com&#34;&gt;Seumas Macdonald&lt;/a&gt; has been talking about and to be more responsive to the occasional inquiries I get from Greek teachers. Also, I have a poster, &lt;em&gt;Vocabulary Ordering in Text-Driven Historical Language Instruction: Sequencing the Ancient Greek Vocabulary of Homer and the New Testament&lt;/em&gt;, that got accepted for &lt;a href=&#34;https://sites.uclouvain.be/eurocall2019/&#34;&gt;EUROCALL 2019&lt;/a&gt; in August, and this code library helps me not only produce the poster but also make it more reproducible.&lt;/p&gt;
&lt;p&gt;Ultimately I hope to write a paper or two out of it as well.&lt;/p&gt;
&lt;p&gt;I’ve started the repo at:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/jtauber/vocabulary-tools/&#34;&gt;https://github.com/jtauber/vocabulary-tools/&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;where I’ve basically rewritten half of my existing code from elsewhere so far. I’ve reproduced the code for generating core vocabulary lists and also the coverage tables I’ve used in multiple talks (including my BibleTech talks in &lt;a href=&#34;/2010/03/28/my-bibletech-2010-talk/&#34;&gt;2010&lt;/a&gt; and &lt;a href=&#34;/2015/05/06/my-bibletech-2015-talk/&#34;&gt;2015&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;I’ve taken the opportunity to generalise and decouple the code (especially with regard to the different chunking systems) and also make use of newer Python stuff like &lt;code&gt;Counter&lt;/code&gt; and dictionary comprehensions which simplifies much of my earlier code.&lt;/p&gt;
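&lt;p&gt;To give a flavour of what &lt;code&gt;Counter&lt;/code&gt; and dictionary comprehensions make short work of, here is a rough sketch. It uses made-up data and is not code from the repository; the names &lt;code&gt;chunks&lt;/code&gt;, &lt;code&gt;lemma_counts&lt;/code&gt;, &lt;code&gt;coverage&lt;/code&gt; and &lt;code&gt;chunk_coverage&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
from collections import Counter

# Hypothetical toy corpus: chunk reference mapped to the lemmas of its tokens.
chunks = {
    "1.1": ["ὁ", "λόγος", "εἰμί", "ὁ", "θεός"],
    "1.2": ["ὁ", "λόγος", "γίνομαι"],
    "1.3": ["ὁ", "φῶς", "φαίνω"],
}

# Frequency of each lemma across the whole corpus.
lemma_counts = Counter(lemma for lemmas in chunks.values() for lemma in lemmas)


def coverage(known, lemmas):
    """Fraction of tokens in a chunk whose lemma is in the known set."""
    return sum(1 for lemma in lemmas if lemma in known) / len(lemmas)


# Suppose a learner knows only the two most frequent lemmas.
known = {lemma for lemma, _ in lemma_counts.most_common(2)}

# Dictionary comprehension: coverage of each chunk given that vocabulary.
chunk_coverage = {ref: coverage(known, lemmas) for ref, lemmas in chunks.items()}

for ref, value in chunk_coverage.items():
    print(ref, format(value, ".0%"))
```

&lt;p&gt;Swapping the keys of &lt;code&gt;chunks&lt;/code&gt; from verses to chapters or sentences changes nothing else, which is the sort of decoupling from the chunking system I’m after.&lt;/p&gt;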
&lt;p&gt;There are a lot of little things you can do with just a couple of lines of Python and I’ve tried to avoid turning those into their own library of tiny functions. Instead, I’m compiling a little tutorial / cookbook as I go which you can read the beginnings of here:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/jtauber/vocabulary-tools/blob/master/examples.rst&#34;&gt;https://github.com/jtauber/vocabulary-tools/blob/master/examples.rst&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;There’s still a fair bit more to move over (even going back 11 years to some stuff from 2008) but let me know if you have any feedback, questions, or suggestions. I’m generalising more and more as I go so expect some things to change dramatically.&lt;/p&gt;
&lt;p&gt;If you’re interested in playing around with this stuff for corpora in other languages, let me know how I can help you get up and running. The main requirement is a tokenised and lemmatised corpus (assuming you want to work with lemmas, not surface forms, as vocabulary items) and also some form of chunking information. See &lt;a href=&#34;https://github.com/jtauber/vocabulary-tools/tree/master/gnt_data&#34;&gt;https://github.com/jtauber/vocabulary-tools/tree/master/gnt_data&lt;/a&gt; for the GNT-specific stuff that would (at least partly) need to be replicated for another corpus.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">One of my goals for 2019 is to bring more structure to various disparate Greek projects and, as part of that, I’ve started consolidating multiple one-off projects I’ve done around vocabulary coverage statistics and ordering experiments.</summary>
  </entry><entry>
    <title type="html">Initial Apostolic Fathers Text Complete</title>
    <link href="https://jktauber.com/2019/02/01/initial-apostolic-fathers-text-complete/" rel="alternate" type="text/html" title="Initial Apostolic Fathers Text Complete"/>
    <published>2019-02-01</published>
    <updated>2019-02-01</updated>
    <id>https://jktauber.com/2019/02/01/initial-apostolic-fathers-text-complete</id>
    <content type="html" xml:base="https://jktauber.com/2019/02/01/initial-apostolic-fathers-text-complete/">&lt;p&gt;Exactly three months ago to the day, I announced that Seumas Macdonald and I were working on a corrected, open, digital edition of the Apostolic Fathers based on Lake. That initial work is now complete.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;/2018/11/01/preparing-open-apostolic-fathers/&#34;&gt;Preparing an Open Apostolic Fathers&lt;/a&gt; discussed the original motivation and the rather detailed process we went through.&lt;/p&gt;
&lt;p&gt;The corrected raw text files are available on GitHub at &lt;a href=&#34;https://github.com/jtauber/apostolic-fathers&#34;&gt;https://github.com/jtauber/apostolic-fathers&lt;/a&gt; but I also generated a static site at &lt;a href=&#34;https://jtauber.github.io/apostolic-fathers/&#34;&gt;https://jtauber.github.io/apostolic-fathers/&lt;/a&gt; to browse the texts. The corrections will be contributed back to the OGL First1KGreek project.&lt;/p&gt;
&lt;p&gt;The next step for us will be to lemmatise the text and there has already been some interest from others in getting the English translation corrected and aligned as well.&lt;/p&gt;
&lt;p&gt;Recall that, while we were essentially correcting the Open Greek and Latin text, we used the CCEL text and that in Logos to identify particular places to look at in the printed text. We did this by lining up the CCEL, OGL and Logos texts and seeing where any of them disagreed. Those became the places we went back to, in multiple scans of the printed Lake, to make our corrections to the base text we started with from OGL.&lt;/p&gt;
&lt;p&gt;How often did each of those three &#34;witnesses&#34; disagree? Here are some stats. &lt;strong&gt;A&lt;/strong&gt; = CCEL, &lt;strong&gt;B&lt;/strong&gt; = OGL, &lt;strong&gt;C&lt;/strong&gt; = Logos. And so &lt;strong&gt;AB/C&lt;/strong&gt; is where CCEL and OGL agreed against Logos, &lt;strong&gt;AC/B&lt;/strong&gt; is where CCEL and Logos agreed against OGL, &lt;strong&gt;A/BC&lt;/strong&gt; is where OGL and Logos agreed against CCEL, and &lt;strong&gt;A/B/C&lt;/strong&gt; is where all three disagreed.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;FILE&lt;/th&gt;
&lt;th&gt;AB/C&lt;/th&gt;
&lt;th&gt;AC/B&lt;/th&gt;
&lt;th&gt;A/BC&lt;/th&gt;
&lt;th&gt;A/B/C&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;001&lt;/td&gt;
&lt;td&gt;1.29%&lt;/td&gt;
&lt;td&gt;1.15%&lt;/td&gt;
&lt;td&gt;7.97%&lt;/td&gt;
&lt;td&gt;0.32%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;002&lt;/td&gt;
&lt;td&gt;0.76%&lt;/td&gt;
&lt;td&gt;1.20%&lt;/td&gt;
&lt;td&gt;3.39%&lt;/td&gt;
&lt;td&gt;0.37%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;003&lt;/td&gt;
&lt;td&gt;1.58%&lt;/td&gt;
&lt;td&gt;2.20%&lt;/td&gt;
&lt;td&gt;4.97%&lt;/td&gt;
&lt;td&gt;0.28%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;004&lt;/td&gt;
&lt;td&gt;0.57%&lt;/td&gt;
&lt;td&gt;1.33%&lt;/td&gt;
&lt;td&gt;7.01%&lt;/td&gt;
&lt;td&gt;0.28%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;005&lt;/td&gt;
&lt;td&gt;1.05%&lt;/td&gt;
&lt;td&gt;1.79%&lt;/td&gt;
&lt;td&gt;6.21%&lt;/td&gt;
&lt;td&gt;0.84%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;006&lt;/td&gt;
&lt;td&gt;0.88%&lt;/td&gt;
&lt;td&gt;1.18%&lt;/td&gt;
&lt;td&gt;7.54%&lt;/td&gt;
&lt;td&gt;0.69%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;007&lt;/td&gt;
&lt;td&gt;0.39%&lt;/td&gt;
&lt;td&gt;0.88%&lt;/td&gt;
&lt;td&gt;3.34%&lt;/td&gt;
&lt;td&gt;0.20%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;008&lt;/td&gt;
&lt;td&gt;0.79%&lt;/td&gt;
&lt;td&gt;0.87%&lt;/td&gt;
&lt;td&gt;5.41%&lt;/td&gt;
&lt;td&gt;0.44%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;009&lt;/td&gt;
&lt;td&gt;0.25%&lt;/td&gt;
&lt;td&gt;1.53%&lt;/td&gt;
&lt;td&gt;2.68%&lt;/td&gt;
&lt;td&gt;0.38%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;010&lt;/td&gt;
&lt;td&gt;0.44%&lt;/td&gt;
&lt;td&gt;4.05%&lt;/td&gt;
&lt;td&gt;4.36%&lt;/td&gt;
&lt;td&gt;0.25%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;011&lt;/td&gt;
&lt;td&gt;0.36%&lt;/td&gt;
&lt;td&gt;1.86%&lt;/td&gt;
&lt;td&gt;4.23%&lt;/td&gt;
&lt;td&gt;0.14%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;012&lt;/td&gt;
&lt;td&gt;0.92%&lt;/td&gt;
&lt;td&gt;1.15%&lt;/td&gt;
&lt;td&gt;5.59%&lt;/td&gt;
&lt;td&gt;0.43%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;013&lt;/td&gt;
&lt;td&gt;1.29%&lt;/td&gt;
&lt;td&gt;0.90%&lt;/td&gt;
&lt;td&gt;6.08%&lt;/td&gt;
&lt;td&gt;0.34%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;014&lt;/td&gt;
&lt;td&gt;1.25%&lt;/td&gt;
&lt;td&gt;0.34%&lt;/td&gt;
&lt;td&gt;4.91%&lt;/td&gt;
&lt;td&gt;0.08%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;015&lt;/td&gt;
&lt;td&gt;0.96%&lt;/td&gt;
&lt;td&gt;0.65%&lt;/td&gt;
&lt;td&gt;6.74%&lt;/td&gt;
&lt;td&gt;0.50%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TOTAL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.11%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.12%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;5.98%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.34%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;One can immediately see CCEL diverged the most from the others (it had considerable lacunae for a start). The numbers for Logos diverging are probably too high because of a weird systemic error, which we only noticed after work had started, where a middle dot was often erroneously added after eta. This ultimately didn&#39;t affect anything other than perhaps flagging places Seumas and I had to check that we otherwise wouldn&#39;t have needed to.&lt;/p&gt;
&lt;p&gt;But at the end of the day, how much did we change? How much of the OGL original remained? How similar was our result to the text on CCEL? And for a bit of fun, how often were my first correction and Seumas&#39;s first correction the same as what we ended up with after consensus was achieved? Here&#39;s the breakdown by work:&lt;/p&gt;
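&lt;p&gt;For the curious, the four disagreement categories in the table can be sketched in a few lines of Python. This is only an illustration, assuming the three texts have already been aligned token by token; the function name &lt;code&gt;classify&lt;/code&gt; and the sample tokens are hypothetical and not taken from our actual scripts.&lt;/p&gt;

```python
from collections import Counter


def classify(a, b, c):
    """Label one aligned token from CCEL (a), OGL (b) and Logos (c)."""
    if a == b == c:
        return "ABC"    # all three agree
    if a == b:
        return "AB/C"   # CCEL and OGL agree against Logos
    if a == c:
        return "AC/B"   # CCEL and Logos agree against OGL
    if b == c:
        return "A/BC"   # OGL and Logos agree against CCEL
    return "A/B/C"      # all three disagree


# Hypothetical aligned tokens (CCEL, OGL, Logos); real input would
# come out of whatever alignment step lines the three texts up.
aligned = [
    ("και", "καὶ", "καὶ"),
    ("θεοῦ", "θεοῦ", "θεοῦ"),
    ("ἡ", "ἡ", "ἡ·"),
    ("ἐν", "εν", "ἐν"),
]

counts = Counter(classify(a, b, c) for a, b, c in aligned)

# Share of aligned tokens falling in each category, as a percentage.
percentages = {label: 100 * n / len(aligned) for label, n in counts.items()}
print(percentages)
```

&lt;p&gt;Anything not labelled &lt;code&gt;ABC&lt;/code&gt; is a place worth checking against the printed text.&lt;/p&gt;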
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;FILE&lt;/th&gt;
&lt;th&gt;CCEL&lt;/th&gt;
&lt;th&gt;OGL&lt;/th&gt;
&lt;th&gt;JT&lt;/th&gt;
&lt;th&gt;SM&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;001&lt;/td&gt;
&lt;td&gt;91.27%&lt;/td&gt;
&lt;td&gt;99.02%&lt;/td&gt;
&lt;td&gt;99.85%&lt;/td&gt;
&lt;td&gt;99.91%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;002&lt;/td&gt;
&lt;td&gt;96.02%&lt;/td&gt;
&lt;td&gt;98.90%&lt;/td&gt;
&lt;td&gt;99.77%&lt;/td&gt;
&lt;td&gt;99.90%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;003&lt;/td&gt;
&lt;td&gt;94.58%&lt;/td&gt;
&lt;td&gt;97.63%&lt;/td&gt;
&lt;td&gt;99.77%&lt;/td&gt;
&lt;td&gt;99.60%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;004&lt;/td&gt;
&lt;td&gt;92.42%&lt;/td&gt;
&lt;td&gt;98.48%&lt;/td&gt;
&lt;td&gt;99.91%&lt;/td&gt;
&lt;td&gt;100.00%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;005&lt;/td&gt;
&lt;td&gt;92.32%&lt;/td&gt;
&lt;td&gt;98.32%&lt;/td&gt;
&lt;td&gt;99.79%&lt;/td&gt;
&lt;td&gt;99.89%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;006&lt;/td&gt;
&lt;td&gt;91.28%&lt;/td&gt;
&lt;td&gt;98.82%&lt;/td&gt;
&lt;td&gt;98.82%&lt;/td&gt;
&lt;td&gt;99.80%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;007&lt;/td&gt;
&lt;td&gt;96.07%&lt;/td&gt;
&lt;td&gt;98.92%&lt;/td&gt;
&lt;td&gt;99.90%&lt;/td&gt;
&lt;td&gt;99.90%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;008&lt;/td&gt;
&lt;td&gt;93.89%&lt;/td&gt;
&lt;td&gt;99.30%&lt;/td&gt;
&lt;td&gt;100.00%&lt;/td&gt;
&lt;td&gt;99.91%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;009&lt;/td&gt;
&lt;td&gt;96.82%&lt;/td&gt;
&lt;td&gt;99.75%&lt;/td&gt;
&lt;td&gt;98.60%&lt;/td&gt;
&lt;td&gt;99.87%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;010&lt;/td&gt;
&lt;td&gt;94.94%&lt;/td&gt;
&lt;td&gt;96.27%&lt;/td&gt;
&lt;td&gt;99.87%&lt;/td&gt;
&lt;td&gt;99.68%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;011&lt;/td&gt;
&lt;td&gt;95.04%&lt;/td&gt;
&lt;td&gt;98.54%&lt;/td&gt;
&lt;td&gt;99.77%&lt;/td&gt;
&lt;td&gt;99.91%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;012&lt;/td&gt;
&lt;td&gt;93.86%&lt;/td&gt;
&lt;td&gt;98.78%&lt;/td&gt;
&lt;td&gt;99.87%&lt;/td&gt;
&lt;td&gt;99.90%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;013&lt;/td&gt;
&lt;td&gt;93.15%&lt;/td&gt;
&lt;td&gt;99.20%&lt;/td&gt;
&lt;td&gt;99.87%&lt;/td&gt;
&lt;td&gt;99.83%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;014&lt;/td&gt;
&lt;td&gt;94.90%&lt;/td&gt;
&lt;td&gt;99.62%&lt;/td&gt;
&lt;td&gt;99.92%&lt;/td&gt;
&lt;td&gt;99.74%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;015&lt;/td&gt;
&lt;td&gt;92.69%&lt;/td&gt;
&lt;td&gt;99.16%&lt;/td&gt;
&lt;td&gt;99.96%&lt;/td&gt;
&lt;td&gt;99.62%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TOTAL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;93.32%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;98.97%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;99.83%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;99.84%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;You just beat me Seumas :-)&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Exactly three months ago to the day, I announced that Seumas Macdonald and I were working on a corrected, open, digital edition of the Apostolic Fathers based on Lake. That initial work is now complete.</summary>
  </entry><entry>
    <title type="html">More Thoughts on Different Morphological Analyses</title>
    <link href="https://jktauber.com/2019/01/14/more-thoughts-different-morphological-analyses/" rel="alternate" type="text/html" title="More Thoughts on Different Morphological Analyses"/>
    <published>2019-01-14</published>
    <updated>2019-01-14</updated>
    <id>https://jktauber.com/2019/01/14/more-thoughts-different-morphological-analyses</id>
    <content type="html" xml:base="https://jktauber.com/2019/01/14/more-thoughts-different-morphological-analyses/">&lt;p&gt;In &lt;a href=&#34;/2018/12/10/five-types-morphological-analysis/&#34;&gt;Five Types of Morphological Analysis&lt;/a&gt; I outlined five distinct ways of approaching morphological (or potentially any linguistic) analysis. In support of some of these, I have some additional examples from a pair of papers I&#39;m reading and a conference I just attended.&lt;/p&gt;
&lt;p&gt;Baayen et al (2018) (co-written by Jim Blevins, my undergraduate advisor from 25 years ago and still a mentor), in describing their own word-based, discriminative approach to morphology, contrast it with both widespread morpheme-based approaches and increasingly popular exponent-focused realizational approaches. I&#39;ll leave a discussion of these different approaches to another time, but what is relevant to my previous post is this comment:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;[morpheme-based and realizational analyses] may be of practical value, especially in the context of adult second language acquisition. It is less clear whether the corresponding theories, whose practical utility derives ultimately from their pedagogical origins, can be accorded any cognitive plausibility.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Note the distinction they are making between analyses of practical (adult SLA, pedagogical) value and cognitive plausibility.&lt;/p&gt;
&lt;p&gt;Again, it&#39;s not the point of this post to describe (much less assess) their arguments for why morphemes and exponents might not be cognitively plausible and what the alternative is, merely that they acknowledge certain analyses might be useful for pedagogical purposes independent of their cognitive plausibility (thereby agreeing with my &lt;strong&gt;psychological&lt;/strong&gt; vs &lt;strong&gt;pedagogical&lt;/strong&gt; distinction).&lt;/p&gt;
&lt;p&gt;Perhaps &lt;strong&gt;cognitive&lt;/strong&gt; would be another word for my &lt;strong&gt;psychological&lt;/strong&gt; category.&lt;/p&gt;
&lt;p&gt;They furthermore suggest:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Constructional schemata, inheritance, and mechanisms spelling out exponents are all products of descriptive traditions that evolved without any influence from research traditions in psychology. As a consequence, it is not self-evident that these notions would provide an adequate characterization of the representations and processes underlying comprehension and production. It seems particularly implausible that children would be motivated to replicate the descriptive scaffolding of [these] theoretical accounts...&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Terms like &#34;descriptive traditions&#34; and &#34;descriptive scaffolding of theoretical accounts&#34; refer to what I had in mind with my &lt;strong&gt;synchronic&lt;/strong&gt; category of analysis. Perhaps &lt;strong&gt;descriptive&lt;/strong&gt; and &lt;strong&gt;theoretical&lt;/strong&gt; would be other words for that category.&lt;/p&gt;
&lt;p&gt;In a related paper, Baayen et al (2019), they talk about three possible responses to the challenge posed to linguistics (or at least linguistically-informed natural language processing) by the success of machine learning.&lt;/p&gt;
&lt;p&gt;Again, it&#39;s outside the scope of this post to get into those details, but in short, their suggested possible responses are: (1) admit defeat, (2) claim the hidden layers reflect traditional linguistic representations, (3) rethink the nature of language processing in the brain. They go on to explore the third option in the context of morphology and the lexicon, stating that&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;the model that we propose here brings together several strands of research across theoretical morphology, psychology, and machine learning.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Note that this is essentially a claim that it&#39;s possible to reconcile at least three of the different approaches I&#39;ve outlined: the synchronic/descriptive/theoretical, the cognitive/psychological, and the algorithmic/machine-learning.&lt;/p&gt;
&lt;p&gt;(Missing here is any reference to diachrony or pedagogy, which I think they would agree are distinct approaches to what they are attempting to unify).&lt;/p&gt;
&lt;p&gt;Now last week, I attended the Society for Computation in Linguistics meeting, coinciding with the big annual meeting of the Linguistic Society of America. One of the goals of SCiL is to build bridges from the NLP community to the linguistics community so it was of particular interest to me.&lt;/p&gt;
&lt;p&gt;But again one of the big things that came up in multiple talks was distinct approaches: the approach of the NLP practitioners, often referred to as the &lt;strong&gt;engineering&lt;/strong&gt; approach, and that of the linguists, often referred to as the &lt;strong&gt;scientific&lt;/strong&gt; approach. At their most self-deprecating, the NLP practitioners confessed their over-obsession with metrics on &#34;tasks&#34; and lack of regard for the underlying scientific &#34;questions&#34;. Noah Smith, in fact, joked that NLPers can annoy linguists by asking what their &#34;task&#34; is and linguists can annoy NLPers by asking what their &#34;question&#34; is.&lt;/p&gt;
&lt;p&gt;The point of mentioning this is yet another example of a difference in approach and perspective.&lt;/p&gt;
&lt;p&gt;Diachrony didn&#39;t feature at all in either the Baayen/Blevins papers or at SCiL, but certainly my other distinctions seem more broadly confirmed (albeit with alternative terminology). So I think we have:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;algorithmic&lt;/strong&gt; / &lt;strong&gt;engineering&lt;/strong&gt; / &lt;strong&gt;task-oriented&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;diachronic&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;synchronic&lt;/strong&gt; / &lt;strong&gt;descriptive&lt;/strong&gt; / &lt;strong&gt;theoretical&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;psychological&lt;/strong&gt; / &lt;strong&gt;cognitive&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;pedagogical&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Now this is not to say some of these approaches can&#39;t be combined (as shown in the Baayen/Blevins papers). But even when one is attempting to combine some of them, I think it&#39;s useful to acknowledge (a) the multiple approaches being combined; (b) other approaches with distinct goals and evaluation procedures that aren&#39;t being considered but which may still be valuable in other contexts.&lt;/p&gt;
&lt;p&gt;At the end of the day, I&#39;m trying to turn arguments of the form &#34;that isn&#39;t a good theory/description/implementation/explanation of morphology&#34; into a more nuanced &#34;it probably isn&#39;t good for this but it might be good for that&#34;.&lt;/p&gt;
&lt;h2&gt;References&lt;/h2&gt;
&lt;p&gt;Baayen, R. H., Chuang, Y. Y., and Blevins, J. P. (2018). Inflectional morphology with linear mappings. The Mental Lexicon, 13 (2), 232-270.&lt;/p&gt;
&lt;p&gt;Baayen, R. H., Chuang, Y. Y., Shafaei-Bajestan, E., and Blevins, J. P. (2019). The discriminative lexicon: A unified computational model for the lexicon and lexical processing in comprehension and production grounded not in (de)composition but in linear discriminative learning. Complexity, 2019, 1-39.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">In &lt;a href=&#34;/2018/12/10/five-types-morphological-analysis/&#34;&gt;Five Types of Morphological Analysis&lt;/a&gt; I outlined five distinct ways of approaching morphological (or potentially any linguistic) analysis. In support of some of these, I have some additional examples from a pair of papers I&#39;m reading and a conference I just attended.</summary>
  </entry><entry>
    <title type="html">Five Types of Morphological Analysis</title>
    <link href="https://jktauber.com/2018/12/10/five-types-morphological-analysis/" rel="alternate" type="text/html" title="Five Types of Morphological Analysis"/>
    <published>2018-12-10</published>
    <updated>2018-12-10</updated>
    <id>https://jktauber.com/2018/12/10/five-types-morphological-analysis</id>
    <content type="html" xml:base="https://jktauber.com/2018/12/10/five-types-morphological-analysis/">&lt;p&gt;People talking about morphological analyses can often speak across each other because they have different purposes in mind. Here&#39;s an initial attempt to outline five possibly distinct notions one might be referring to.&lt;/p&gt;
&lt;p&gt;I&#39;m tentatively labelling them:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;algorithmic&lt;/li&gt;
&lt;li&gt;diachronic&lt;/li&gt;
&lt;li&gt;synchronic&lt;/li&gt;
&lt;li&gt;psychological&lt;/li&gt;
&lt;li&gt;pedagogical&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;although the labels matter less than being clear about the distinction.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Algorithmic&lt;/strong&gt; means I can go from an inflected form to a lemma + morphosyntactic properties (or vice versa) efficiently on a computer. The way this is achieved might not be psychologically plausible or historically accurate but it can be implemented in software to get the job done.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Diachronic&lt;/strong&gt; means I can explain (or at least speculate) how the inflected form came about: what the roots are, what grammaticalisation took place, what sound changes explain seeming irregularities, etc.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Synchronic&lt;/strong&gt; means I can describe the inflected forms without recourse to historical data or reconstruction. This might focus on perspicuity rather than computational efficiency or psychological plausibility.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Psychological&lt;/strong&gt; means the analysis is consistent with what I think is (or was) going on in the minds of native speakers. Some people may equate this with synchronic analyses but I think you can have a psychologically implausible yet still descriptively adequate synchronic analysis.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pedagogical&lt;/strong&gt; means a useful way of explaining it to students. This &lt;em&gt;may&lt;/em&gt; be diachronic, but might be more synchronic (whether psychologically plausible or not).&lt;/p&gt;
&lt;p&gt;Analyses can obviously be compatible with more than one of these. But I think it&#39;s helpful to be clear what the goals of any morphological description are. If the goal is to lemmatise and tag a new text, then psychological or historical plausibility, or analytical or pedagogical clarity might not matter. If one&#39;s goal is a diachronically-informed analysis to help students, it should be clear why an otherwise perfectly adequate morphological parser might not be producing useful information.&lt;/p&gt;
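&lt;p&gt;To make the &lt;strong&gt;algorithmic&lt;/strong&gt; notion concrete, here is a minimal sketch in Python, assuming nothing more than a lookup table. The table entries, the tag strings and the function names are all hypothetical illustrations, not a real parser. The point is only that a mapping in both directions can get the job done without claiming any historical or psychological plausibility.&lt;/p&gt;

```python
# Hypothetical toy lexicon: inflected form to (lemma, morphosyntactic tag).
# The entries and the tag strings are illustrative assumptions only.
ANALYSES = {
    "λαμβάνω": ("λαμβάνω", "present active indicative 1SG"),
    "λήμψομαι": ("λαμβάνω", "future middle indicative 1SG"),
    "μαθητοῦ": ("μαθητής", "genitive singular"),
    "ἔδωκεν": ("δίδωμι", "aorist active indicative 3SG"),
}

# Invert the table so the same data also supports generation.
FORMS = {analysis: form for form, analysis in ANALYSES.items()}


def analyse(form):
    """Return (lemma, tag) for a known form, or None if it is not in the table."""
    return ANALYSES.get(form)


def generate(lemma, tag):
    """Return the inflected form for a known (lemma, tag) pair, or None."""
    return FORMS.get((lemma, tag))


print(analyse("λήμψομαι"))
print(generate("μαθητής", "genitive singular"))
```

&lt;p&gt;Nothing in this table says &lt;em&gt;why&lt;/em&gt; λήμψομαι belongs with λαμβάνω, which is exactly what a diachronic or pedagogical analysis would want to explain.&lt;/p&gt;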
&lt;p&gt;Those who have been following my &lt;em&gt;Tour of Greek Morphology&lt;/em&gt; know I&#39;ve tried to be careful distinguishing, for example, historical explanations from how I think native speakers internalise(d) word forms, or how students should learn them.&lt;/p&gt;
&lt;p&gt;I still come across a lot of people who think the &#34;modern&#34; way of understanding morphology is learning the &#34;morphemes&#34; and rules, not memorising paradigms. Besides getting the history somewhat wrong, this is also making the mistake of conflating these different types of analyses and not recognising that one type of analysis might be perfectly valid for one purpose but not another.&lt;/p&gt;
&lt;p&gt;Here&#39;s a fun game to play: how would you analyse/explain the form λαμβάνω? Or ἔλαβον (especially when 3rd plural) or λήμψομαι? Or μαθητής vs μαθητοῦ? Or ἔδωκεν vs δέδωκα vs δός?&lt;/p&gt;
&lt;p&gt;Maybe I haven&#39;t quite nailed the labels yet. Maybe there are further distinctions to draw. I welcome people&#39;s input.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">People talking about morphological analyses can often speak across each other because they have different purposes in mind. Here&#39;s an initial attempt to outline five possibly distinct notions one might be referring to.</summary>
  </entry><entry>
    <title type="html">Preparing an Open Apostolic Fathers</title>
    <link href="https://jktauber.com/2018/11/01/preparing-open-apostolic-fathers/" rel="alternate" type="text/html" title="Preparing an Open Apostolic Fathers"/>
    <published>2018-11-01</published>
    <updated>2018-11-01</updated>
    <id>https://jktauber.com/2018/11/01/preparing-open-apostolic-fathers</id>
    <content type="html" xml:base="https://jktauber.com/2018/11/01/preparing-open-apostolic-fathers/">&lt;p&gt;I&#39;m working with Seumas Macdonald on an open, corrected digital edition of the Apostolic Fathers based on Lake.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://thepatrologist.com&#34;&gt;Seumas Macdonald&lt;/a&gt; asked me a few weeks ago what it would take to expand some of our text and vocab ordering experiments to the text of Apostolic Fathers (we&#39;re both desirous of more comprehensible input for Greek learners).&lt;/p&gt;
&lt;p&gt;My reply was that we first of all needed to get a good open text and then lemmatise it. I thought the &#34;get a good open text&#34; would be trivial but it turned out not to be.&lt;/p&gt;
&lt;p&gt;I asked around without much positive response. I found HTML versions of the Lake texts on the &lt;a href=&#34;https://www.ccel.org&#34;&gt;Christian Classics Ethereal Library&lt;/a&gt; (CCEL) website but they turned out to be problematic quality-wise (see below).&lt;/p&gt;
&lt;p&gt;It then occurred to me to check what was in the &lt;a href=&#34;http://www.perseus.tufts.edu/hopper/&#34;&gt;Perseus Digital Library&lt;/a&gt;. It only had the Epistle of Barnabas but the related &lt;a href=&#34;http://opengreekandlatin.github.io/First1KGreek/&#34;&gt;First 1000 Years of Greek&lt;/a&gt; at the Open Greek and Latin Project had done the rest.&lt;/p&gt;
&lt;p&gt;The Perseus/OGL texts were considerably better than the CCEL ones, but were still not without problems. It was clear that the two collections had been produced independently, however, which is important for what follows.&lt;/p&gt;
&lt;p&gt;I&#39;m almost certain the CCEL texts were keyed in. There is haplography and dittography galore! The haplography even corresponds almost perfectly to line breaks in the printed Lake editions I looked at.&lt;/p&gt;
&lt;p&gt;The Perseus/OGL texts, on the other hand, are the results of OCR with some manual correction.&lt;/p&gt;
&lt;p&gt;I wrote some code to extract both the CCEL and Perseus/OGL texts and put them in a comparable format. I then wrote a script to align the two. My thinking was to go through all the places where the two disagreed, check the printed Lake and correct the Perseus/OGL text accordingly.&lt;/p&gt;
&lt;p&gt;I decided to throw the Lake text from Logos into the mix as well, not as an input to the correction itself but merely as another &#34;edition&#34; to flag differences with (to then check with the printed Lake).&lt;/p&gt;
&lt;p&gt;Thus began a project Seumas and I have been working on the last few weeks. Once differences in any of the three texts are identified, they are flagged for review and Seumas and I independently look at the printed Lake and correct the Perseus/OGL base text.&lt;/p&gt;
&lt;p&gt;If our corrections disagree, we continue to work on them until we come to consensus. This three-way comparison followed by two-way independent correction is proving to work very well (although it&#39;s a lot of work!).&lt;/p&gt;
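&lt;p&gt;The comparison step can be sketched roughly as follows. This is not the code in the repository, just a minimal illustration in Python using the standard library &lt;code&gt;difflib&lt;/code&gt; module. It assumes each edition has already been reduced to a list of tokens; the function names and the placeholder tokens (w1, w2, ...) are hypothetical.&lt;/p&gt;

```python
import difflib


def disagreements(base, other):
    """Return (start, end) token spans of `base` where `other` differs."""
    matcher = difflib.SequenceMatcher(a=base, b=other, autojunk=False)
    return [
        (i1, i2)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def flag_for_review(base, witnesses):
    """Map each differing span of the base text to the witnesses that disagree there."""
    flags = {}
    for name, tokens in witnesses.items():
        for span in disagreements(base, tokens):
            flags.setdefault(span, []).append(name)
    return flags


# Placeholder tokens stand in for the words of a real passage.
base = "w1 w2 w3 w4 w5 w6".split()
witnesses = {
    "ccel": "w1 w2 w3 w4 w6".split(),        # a word dropped (haplography)
    "logos": "w1 x2 w3 w4 w5 w6".split(),    # a word that differs
}

for span, names in sorted(flag_for_review(base, witnesses).items()):
    print(span, names)
```

&lt;p&gt;Every flagged span is then a place to open the printed Lake, regardless of which edition turns out to be right.&lt;/p&gt;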
&lt;p&gt;All the code, the source texts (except Logos), and work-in-progress are available at&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/jtauber/apostolic-fathers&#34;&gt;https://github.com/jtauber/apostolic-fathers&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;and you can follow the status in the README. There are also more detailed notes on the whole process.&lt;/p&gt;
&lt;p&gt;Once the candidate versions of all the texts are published, I&#39;ll do another post just with some interesting statistics on the nature of errors in the CCEL, Perseus/OGL, and Logos texts. The &#34;scribal errors&#34; in the CCEL text are particularly fascinating but even some of the Perseus/OGL OCR errors will be worth writing about.&lt;/p&gt;
&lt;p&gt;Seumas and I will then contribute back the corrections to CCEL, Perseus/OGL, and Logos. Hopefully our texts will also be featured on the &lt;a href=&#34;http://biblicalhumanities.org/dashboard/&#34;&gt;Biblical Humanities Dashboard&lt;/a&gt; as the go-to open digital text of the Apostolic Fathers (so no one else has to repeat this effort).&lt;/p&gt;
&lt;p&gt;Finally, we&#39;ll start the process of lemmatisation so the Apostolic Fathers can be included in our open learning materials.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I&#39;m working with Seumas Macdonald on an open, corrected digital edition of the Apostolic Fathers based on Lake.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 27</title>
    <link href="https://jktauber.com/2018/10/18/tour-greek-morphology-part-27/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 27"/>
    <published>2018-10-18</published>
    <updated>2018-10-18</updated>
    <id>https://jktauber.com/2018/10/18/tour-greek-morphology-part-27</id>
    <content type="html" xml:base="https://jktauber.com/2018/10/18/tour-greek-morphology-part-27/">&lt;p&gt;Part twenty-seven of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;Let&#39;s finish our survey of imperfect middle endings in the indicative with the athematic verbs.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IM-6&lt;/th&gt;
&lt;th&gt;IM-7&lt;/th&gt;
&lt;th&gt;IM-8&lt;/th&gt;
&lt;th&gt;IM-9&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xύμην&lt;/td&gt;
&lt;td&gt;Xέμην&lt;/td&gt;
&lt;td&gt;Xόμην&lt;/td&gt;
&lt;td&gt;Xάμην&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xυσο&lt;/td&gt;
&lt;td&gt;Xεσο&lt;/td&gt;
&lt;td&gt;Xοσο&lt;/td&gt;
&lt;td&gt;Xασο/Xω&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xυτο&lt;/td&gt;
&lt;td&gt;Xετο&lt;/td&gt;
&lt;td&gt;Xοτο&lt;/td&gt;
&lt;td&gt;Xατο&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xύμεθα&lt;/td&gt;
&lt;td&gt;Xέμεθα&lt;/td&gt;
&lt;td&gt;Xόμεθα&lt;/td&gt;
&lt;td&gt;Xάμεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xυσθε&lt;/td&gt;
&lt;td&gt;Xεσθε&lt;/td&gt;
&lt;td&gt;Xοσθε&lt;/td&gt;
&lt;td&gt;Xασθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xυντο&lt;/td&gt;
&lt;td&gt;Xεντο&lt;/td&gt;
&lt;td&gt;Xοντο&lt;/td&gt;
&lt;td&gt;Xαντο&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The classes are similar to their &lt;strong&gt;IA-&lt;/strong&gt; equivalents except there is no ablaut between the singular and plural.&lt;/p&gt;
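&lt;table&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM-6&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;-νυ- verbs like δείκνυμι&lt;/td&gt;
&lt;td&gt;stem ends in ῠ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM-7&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;τίθημι, ἵημι and their compounds&lt;/td&gt;
&lt;td&gt;stem ends in ε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM-8&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;δίδωμι and compounds&lt;/td&gt;
&lt;td&gt;stem ends in ο&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IM-9&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ἵστημι and compounds&lt;/td&gt;
&lt;td&gt;stem ends in ᾰ&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;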
&lt;p&gt;The intervocalic sigma in &lt;strong&gt;2SG&lt;/strong&gt; generally does not drop out in the athematics although it sometimes does, particularly in &lt;strong&gt;IM-9&lt;/strong&gt;, which seems to be the class that has most begun to merge with the thematics. Note, though, that the lack of circumflex in this case eliminates confusion with an &lt;strong&gt;IM-4&lt;/strong&gt; &lt;strong&gt;2SG&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The lack of circumflex in the &lt;strong&gt;3SG&lt;/strong&gt; and &lt;strong&gt;2PL&lt;/strong&gt; also eliminates confusion with &lt;strong&gt;IM-4&lt;/strong&gt; in those cells.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;IM-7&lt;/strong&gt; can be confused for &lt;strong&gt;IM-1&lt;/strong&gt; in the &lt;strong&gt;3SG&lt;/strong&gt; and &lt;strong&gt;2PL&lt;/strong&gt;, though.&lt;/p&gt;
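&lt;p&gt;These overlaps can be checked mechanically. The following is a minimal sketch in Python, with the &lt;strong&gt;IM-1&lt;/strong&gt; and &lt;strong&gt;IM-4&lt;/strong&gt; endings copied from the earlier thematic table and the &lt;strong&gt;IM-7&lt;/strong&gt; and &lt;strong&gt;IM-9&lt;/strong&gt; endings from the table above. The function names are hypothetical, and stripping combining marks is a crude stand-in for writing without accents (it also discards the vowel length that separates &lt;strong&gt;IM-4&lt;/strong&gt; from &lt;strong&gt;IM-9&lt;/strong&gt;).&lt;/p&gt;

```python
import unicodedata

CELLS = ["1SG", "2SG", "3SG", "1PL", "2PL", "3PL"]

# "X" stands for the rest of the word; a cell may list more than one ending.
IM_1 = {"1SG": ["Xόμην"], "2SG": ["Xου"], "3SG": ["Xετο"],
        "1PL": ["Xόμεθα"], "2PL": ["Xεσθε"], "3PL": ["Xοντο"]}
IM_4 = {"1SG": ["Xώμην"], "2SG": ["Xῶ"], "3SG": ["Xᾶτο"],
        "1PL": ["Xώμεθα"], "2PL": ["Xᾶσθε"], "3PL": ["Xῶντο"]}
IM_7 = {"1SG": ["Xέμην"], "2SG": ["Xεσο"], "3SG": ["Xετο"],
        "1PL": ["Xέμεθα"], "2PL": ["Xεσθε"], "3PL": ["Xεντο"]}
IM_9 = {"1SG": ["Xάμην"], "2SG": ["Xασο", "Xω"], "3SG": ["Xατο"],
        "1PL": ["Xάμεθα"], "2PL": ["Xασθε"], "3PL": ["Xαντο"]}


def strip_accents(form):
    """Remove combining marks (accents and length marks) from a form."""
    decomposed = unicodedata.normalize("NFD", form)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def shared_cells(class_a, class_b, ignore_accents=False):
    """List the cells in which two classes have at least one identical ending."""
    key = strip_accents if ignore_accents else (lambda form: form)
    shared = []
    for cell in CELLS:
        forms_a = {key(form) for form in class_a[cell]}
        forms_b = {key(form) for form in class_b[cell]}
        if forms_a.intersection(forms_b):
            shared.append(cell)
    return shared


print(shared_cells(IM_1, IM_7))
print(shared_cells(IM_4, IM_9))
print(shared_cells(IM_4, IM_9, ignore_accents=True))
```

&lt;p&gt;With accents, only &lt;strong&gt;IM-7&lt;/strong&gt; and &lt;strong&gt;IM-1&lt;/strong&gt; collide (in the &lt;strong&gt;3SG&lt;/strong&gt; and &lt;strong&gt;2PL&lt;/strong&gt;). It is only when accents are ignored that &lt;strong&gt;IM-9&lt;/strong&gt; and &lt;strong&gt;IM-4&lt;/strong&gt; collide in the &lt;strong&gt;2SG&lt;/strong&gt;, &lt;strong&gt;3SG&lt;/strong&gt; and &lt;strong&gt;2PL&lt;/strong&gt;.&lt;/p&gt;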
&lt;p&gt;In the next few posts we&#39;ll summarise the inference rules and ambiguities for the imperfect and look at some type and token frequencies, just like we did for the present.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part twenty-seven of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 26</title>
    <link href="https://jktauber.com/2018/09/08/tour-greek-morphology-part-26/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 26"/>
    <published>2018-09-08</published>
    <updated>2018-09-08</updated>
    <id>https://jktauber.com/2018/09/08/tour-greek-morphology-part-26</id>
    <content type="html" xml:base="https://jktauber.com/2018/09/08/tour-greek-morphology-part-26/">&lt;p&gt;Part twenty-six of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;We&#39;ve looked at the imperfect endings for the thematic actives and middles. Now let&#39;s look at the athematic active endings.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IA-6&lt;/th&gt;
&lt;th&gt;IA-7&lt;/th&gt;
&lt;th&gt;IA-8&lt;/th&gt;
&lt;th&gt;IA-9&lt;/th&gt;
&lt;th&gt;IA-9b&lt;/th&gt;
&lt;th&gt;IA-10&lt;/th&gt;
&lt;th&gt;IA-11&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xῡν&lt;/td&gt;
&lt;td&gt;Xην/Xειν&lt;/td&gt;
&lt;td&gt;Xουν&lt;/td&gt;
&lt;td&gt;Xην&lt;/td&gt;
&lt;td&gt;Xην&lt;/td&gt;
&lt;td&gt;ἦ/ἦν&lt;/td&gt;
&lt;td&gt;ᾖα/ᾔειν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xῡς&lt;/td&gt;
&lt;td&gt;Xεις&lt;/td&gt;
&lt;td&gt;Xους&lt;/td&gt;
&lt;td&gt;Xης&lt;/td&gt;
&lt;td&gt;Xης/Xησθα&lt;/td&gt;
&lt;td&gt;ἦς/ἦσθα&lt;/td&gt;
&lt;td&gt;ᾔεις/ᾔεισθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xῡ&lt;/td&gt;
&lt;td&gt;Xει&lt;/td&gt;
&lt;td&gt;Xου&lt;/td&gt;
&lt;td&gt;Xη&lt;/td&gt;
&lt;td&gt;Xη&lt;/td&gt;
&lt;td&gt;ἦν&lt;/td&gt;
&lt;td&gt;ᾔει/ᾔειν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xυμεν&lt;/td&gt;
&lt;td&gt;Xεμεν&lt;/td&gt;
&lt;td&gt;Xομεν&lt;/td&gt;
&lt;td&gt;Xαμεν&lt;/td&gt;
&lt;td&gt;Xαμεν&lt;/td&gt;
&lt;td&gt;ἦμεν&lt;/td&gt;
&lt;td&gt;ᾖμεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xυτε&lt;/td&gt;
&lt;td&gt;Xετε&lt;/td&gt;
&lt;td&gt;Xοτε&lt;/td&gt;
&lt;td&gt;Xατε&lt;/td&gt;
&lt;td&gt;Xατε&lt;/td&gt;
&lt;td&gt;ἦτε&lt;/td&gt;
&lt;td&gt;ᾖτε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xυσαν&lt;/td&gt;
&lt;td&gt;Xεσαν&lt;/td&gt;
&lt;td&gt;Xοσαν&lt;/td&gt;
&lt;td&gt;Xασαν&lt;/td&gt;
&lt;td&gt;Xασαν&lt;/td&gt;
&lt;td&gt;ἦσαν&lt;/td&gt;
&lt;td&gt;ᾖσαν/ᾔεσαν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;IA-6&lt;/strong&gt; is the -νυ- verbs like δείκνυμι. There is ablaut between the singular and plural (ῡ vs υ).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;IA-9&lt;/strong&gt; is ἵστημι and compounds. There is again the expected singular/plural ablaut (η vs α).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;IA-8&lt;/strong&gt; is δίδωμι and compounds. There is a vowel alternation but it is ου/ο and not ω/ο ablaut like in the present.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;IA-7&lt;/strong&gt; is τίθημι, ἵημι and their compounds. The vowel alternation here is ει/ε and not η/ε ablaut like in the present except for the η in the &lt;strong&gt;1SG&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;IA-9b&lt;/strong&gt; is φημί which is like ἵστημι but with the added &lt;strong&gt;2SG&lt;/strong&gt; Xησθα.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;IA-10&lt;/strong&gt; and &lt;strong&gt;IA-11&lt;/strong&gt; are εἰμί and εἶμι respectively. The -σθα &lt;strong&gt;2SG&lt;/strong&gt; ending comes up again but there are other differences that we will eventually want to unpack.&lt;/p&gt;
&lt;p&gt;For the most part, the endings follow those of the thematic imperfects. The consistent difference is the &lt;strong&gt;3PL&lt;/strong&gt; -σαν (although see below).&lt;/p&gt;
&lt;p&gt;We&#39;ll save for later posts what&#39;s going on with the -σθα ending and with various parts of the &lt;strong&gt;IA-10&lt;/strong&gt; and &lt;strong&gt;IA-11&lt;/strong&gt; paradigms. But I want to note something intriguing about the unexpected vowel alternations in &lt;strong&gt;IA-7&lt;/strong&gt; and &lt;strong&gt;IA-8&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Xουν ~ Xους ~ Xου is what we see in &lt;strong&gt;IA-3&lt;/strong&gt; and Xεις ~ Xει in &lt;strong&gt;IA-2&lt;/strong&gt;. This suggests that these athematic verbs were starting to be inflected &lt;em&gt;as if&lt;/em&gt; they were thematic.&lt;/p&gt;
&lt;p&gt;Along similar lines, John 21.18 has ἐζώννυες with a theme vowel. Acts 27.1 has παρεδίδουν for the plural (yet παρεδίδοσαν in Acts 16.4).&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part twenty-six of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">Back from International Colloquium on Ancient Greek Linguistics</title>
    <link href="https://jktauber.com/2018/09/06/back-international-colloquium-ancient-greek-lingui/" rel="alternate" type="text/html" title="Back from International Colloquium on Ancient Greek Linguistics"/>
    <published>2018-09-06</published>
    <updated>2018-09-06</updated>
    <id>https://jktauber.com/2018/09/06/back-international-colloquium-ancient-greek-lingui</id>
    <content type="html" xml:base="https://jktauber.com/2018/09/06/back-international-colloquium-ancient-greek-lingui/">&lt;p&gt;Last week I attended the ninth International Colloquium on Ancient Greek Linguistics at the University of Helsinki.&lt;/p&gt;
&lt;p&gt;It was an excellent conference with a lot of good linguistic and philological content featuring some nice quantitative analyses.&lt;/p&gt;
&lt;p&gt;Some of the paper highlights for me:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Paul Kiparsky&lt;/strong&gt; on a regular sound change explanation (via Optimality Theory) for various alternations usually explained via analogy
&lt;br&gt;&lt;a href=&#34;https://www.helsinki.fi/en/conferences/international-colloquium-on-ancient-greek-linguistics/abstracts-a-k#section-58193&#34;&gt;abstract&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Robert Crellin&lt;/strong&gt; on the ambiguity of Greek without vowels as part of an exploration of why Greek introduced written vowels in the first place &lt;br&gt;&lt;a href=&#34;https://www.helsinki.fi/en/conferences/international-colloquium-on-ancient-greek-linguistics/abstracts-a-k#section-58038&#34;&gt;abstract&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Lucien van Beek&lt;/strong&gt; on atelic perfects in Homeric Greek
&lt;br&gt;&lt;a href=&#34;https://www.helsinki.fi/en/conferences/international-colloquium-on-ancient-greek-linguistics/abstracts-k-z#section-58151&#34;&gt;abstract&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;David Goldstein&lt;/strong&gt; on differential agent marking (dative vs prepositional phrase) in Herodotus
&lt;br&gt;&lt;a href=&#34;https://www.helsinki.fi/en/conferences/international-colloquium-on-ancient-greek-linguistics/abstracts-a-k#section-58055&#34;&gt;abstract&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sandra Rodríguez Piedrabuena&lt;/strong&gt; on (im)politeness strategies in Ancient Greek
&lt;br&gt;&lt;a href=&#34;https://www.helsinki.fi/en/conferences/international-colloquium-on-ancient-greek-linguistics/abstracts-k-z#section-58138&#34;&gt;abstract&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I may do individual follow-up posts to some of these as they inspired potential investigations of my own in the future.&lt;/p&gt;
&lt;p&gt;It was also great just catching up with people I&#39;ve met the last couple of years at Greek and Indo-European conferences at UCLA, Oxford, and Cambridge.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Last week I attended the ninth International Colloquium on Ancient Greek Linguistics at the University of Helsinki.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 25</title>
    <link href="https://jktauber.com/2018/08/25/tour-greek-morphology-part-25/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 25"/>
    <published>2018-08-25</published>
    <updated>2018-08-25</updated>
    <id>https://jktauber.com/2018/08/25/tour-greek-morphology-part-25</id>
    <content type="html" xml:base="https://jktauber.com/2018/08/25/tour-greek-morphology-part-25/">&lt;p&gt;Part twenty-five of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;In the &lt;a href=&#34;/2018/07/29/tour-greek-morphology-part-24/&#34;&gt;previous part&lt;/a&gt; we looked at the endings of the active imperfects with theme vowels. Now we are going to look at the middles.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IM-1&lt;/th&gt;
&lt;th&gt;IM-2&lt;/th&gt;
&lt;th&gt;IM-3&lt;/th&gt;
&lt;th&gt;IM-4&lt;/th&gt;
&lt;th&gt;IM-5&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xόμην&lt;/td&gt;
&lt;td&gt;Xούμην&lt;/td&gt;
&lt;td&gt;Xούμην&lt;/td&gt;
&lt;td&gt;Xώμην&lt;/td&gt;
&lt;td&gt;Xώμην&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xου&lt;/td&gt;
&lt;td&gt;Xοῦ&lt;/td&gt;
&lt;td&gt;Xοῦ&lt;/td&gt;
&lt;td&gt;Xῶ&lt;/td&gt;
&lt;td&gt;Xῶ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xετο&lt;/td&gt;
&lt;td&gt;Xεῖτο&lt;/td&gt;
&lt;td&gt;Xοῦτο&lt;/td&gt;
&lt;td&gt;Xᾶτο&lt;/td&gt;
&lt;td&gt;Xῆτο&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xόμεθα&lt;/td&gt;
&lt;td&gt;Xούμεθα&lt;/td&gt;
&lt;td&gt;Xούμεθα&lt;/td&gt;
&lt;td&gt;Xώμεθα&lt;/td&gt;
&lt;td&gt;Xώμεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xεσθε&lt;/td&gt;
&lt;td&gt;Xεῖσθε&lt;/td&gt;
&lt;td&gt;Xοῦσθε&lt;/td&gt;
&lt;td&gt;Xᾶσθε&lt;/td&gt;
&lt;td&gt;Xῆσθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xοντο&lt;/td&gt;
&lt;td&gt;Xοῦντο&lt;/td&gt;
&lt;td&gt;Xοῦντο&lt;/td&gt;
&lt;td&gt;Xῶντο&lt;/td&gt;
&lt;td&gt;Xῶντο&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The vowel differences between these five different classes of verb should largely be familiar to you by now as they&#39;re pretty much the same pattern we&#39;ve seen in the present active, present middle, and imperfect active—namely:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;-2&lt;/strong&gt; class historically had an ε before the theme vowel and this led (depending on whether the theme vowel was ε or ο) to ει or ου&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;-3&lt;/strong&gt; class historically had an ο before the theme vowel and this led (regardless of whether the theme vowel was ε or ο) to ου&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;-4&lt;/strong&gt; class historically had an α before the theme vowel and this led (depending on whether the theme vowel was ε or ο) to ᾱ or ω&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;-5&lt;/strong&gt; class is like the &lt;strong&gt;-4&lt;/strong&gt; class but with an η for the ᾱ&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;One difference in the above table from what we&#39;ve seen before is that the &lt;strong&gt;2SG&lt;/strong&gt; ending is identical between &lt;strong&gt;IM-2&lt;/strong&gt; and &lt;strong&gt;IM-3&lt;/strong&gt; and between &lt;strong&gt;IM-4&lt;/strong&gt; and &lt;strong&gt;IM-5&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The fact the distinguisher is a bare diphthong might remind you of the &lt;strong&gt;2SG&lt;/strong&gt; in the present middle, which in &lt;a href=&#34;/2017/07/23/tour-greek-morphology-part-9/&#34;&gt;part 9&lt;/a&gt; we partially explained as historically coming from a dropped intervocalic sigma (e.g. ε+σαι &amp;gt; εαι &amp;gt; ηι &amp;gt; ῃ). This is indeed what happened here too.&lt;/p&gt;
&lt;p&gt;The pattern is clearer put alongside the &lt;strong&gt;3SG&lt;/strong&gt; and &lt;strong&gt;3PL&lt;/strong&gt; as well.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;PM-1&lt;/th&gt;
&lt;th&gt;IM-1&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ε+σαι &amp;gt; ῃ&lt;/td&gt;
&lt;td&gt;ε+σο &amp;gt; ου&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ε+ται&lt;/td&gt;
&lt;td&gt;ε+το&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ο+νται&lt;/td&gt;
&lt;td&gt;ο+ντο&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;We can see here that, prior to the dropping of the sigma (and subsequent contraction) to a long-ο written as a spurious diphthong ου, the present and imperfect endings in the &lt;strong&gt;2SG&lt;/strong&gt;, &lt;strong&gt;3SG&lt;/strong&gt;, and &lt;strong&gt;3PL&lt;/strong&gt; just differed in a final αι/ο alternation (which is tantalisingly close to just an iota/no-iota alternation like we might expect).&lt;/p&gt;
&lt;p&gt;If we try to summarise the historical origins of the personal endings, we might get something like the following:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;PA&lt;/th&gt;
&lt;th&gt;IA&lt;/th&gt;
&lt;th&gt;PM&lt;/th&gt;
&lt;th&gt;IM&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;μι&lt;/td&gt;
&lt;td&gt;μ&lt;/td&gt;
&lt;td&gt;μαι&lt;/td&gt;
&lt;td&gt;μην&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;σι&lt;/td&gt;
&lt;td&gt;σ&lt;/td&gt;
&lt;td&gt;σαι&lt;/td&gt;
&lt;td&gt;σο&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;τι&lt;/td&gt;
&lt;td&gt;τ&lt;/td&gt;
&lt;td&gt;ται&lt;/td&gt;
&lt;td&gt;το&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;μεν&lt;/td&gt;
&lt;td&gt;μεν&lt;/td&gt;
&lt;td&gt;μεθα&lt;/td&gt;
&lt;td&gt;μεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;τε&lt;/td&gt;
&lt;td&gt;τε&lt;/td&gt;
&lt;td&gt;σθε&lt;/td&gt;
&lt;td&gt;σθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ντι&lt;/td&gt;
&lt;td&gt;ντ&lt;/td&gt;
&lt;td&gt;νται&lt;/td&gt;
&lt;td&gt;ντο&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;There is a clear μ/σ/τ/ντ pattern in the &lt;strong&gt;1SG&lt;/strong&gt;/&lt;strong&gt;2SG&lt;/strong&gt;/&lt;strong&gt;3SG&lt;/strong&gt;/&lt;strong&gt;3PL&lt;/strong&gt;. Cross-cutting this there is a clear ι/-/αι/ο pattern in the &lt;strong&gt;PA&lt;/strong&gt;/&lt;strong&gt;IA&lt;/strong&gt;/&lt;strong&gt;PM&lt;/strong&gt;/&lt;strong&gt;IM&lt;/strong&gt;. The exception is the μην in the &lt;strong&gt;IM&lt;/strong&gt; &lt;strong&gt;1SG&lt;/strong&gt; (where we might expect μο).&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;1PL&lt;/strong&gt; and &lt;strong&gt;2PL&lt;/strong&gt; seem to be playing by a different set of rules and notice they don&#39;t make a distinction between the present and imperfect at all.&lt;/p&gt;
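&lt;p&gt;The cross-cutting pattern can be stated as a tiny grid and checked against the table. This is only a sketch in Python, assuming the reconstructed endings exactly as given above; the names &lt;code&gt;PERSON&lt;/code&gt;, &lt;code&gt;SERIES&lt;/code&gt;, &lt;code&gt;TABLE&lt;/code&gt; and &lt;code&gt;exceptions&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
# Person/number marker and present/imperfect, active/middle suffix.
PERSON = {"1SG": "μ", "2SG": "σ", "3SG": "τ", "3PL": "ντ"}
SERIES = {"PA": "ι", "IA": "", "PM": "αι", "IM": "ο"}

# Reconstructed endings as listed in the summary table.
TABLE = {
    "1SG": {"PA": "μι", "IA": "μ", "PM": "μαι", "IM": "μην"},
    "2SG": {"PA": "σι", "IA": "σ", "PM": "σαι", "IM": "σο"},
    "3SG": {"PA": "τι", "IA": "τ", "PM": "ται", "IM": "το"},
    "3PL": {"PA": "ντι", "IA": "ντ", "PM": "νται", "IM": "ντο"},
}


def exceptions():
    """Return (person, series, predicted, listed) wherever marker + suffix fails."""
    return [
        (person, series, marker + suffix, TABLE[person][series])
        for person, marker in PERSON.items()
        for series, suffix in SERIES.items()
        if marker + suffix != TABLE[person][series]
    ]


for row in exceptions():
    print(row)
```

&lt;p&gt;The only cell the grid fails to predict is the &lt;strong&gt;IM&lt;/strong&gt; &lt;strong&gt;1SG&lt;/strong&gt;, where it gives μο against the μην in the table.&lt;/p&gt;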
&lt;p&gt;Note that this summary of endings, while providing a historical background to the Greek forms we see, is really in the realm of Indo-European comparative linguistics rather than Greek. It&#39;s the foundation of how Ancient Greek came to be the way it was, but it doesn&#39;t reflect the way native speakers would have internalised inflections, nor should it be taken to suggest how they should be taught nowadays.&lt;/p&gt;
&lt;p&gt;The goal here is to explain some things once the &lt;em&gt;actual&lt;/em&gt; endings are already familiar.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part twenty-five of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 24</title>
    <link href="https://jktauber.com/2018/07/29/tour-greek-morphology-part-24/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 24"/>
    <published>2018-07-29</published>
    <updated>2018-07-29</updated>
    <id>https://jktauber.com/2018/07/29/tour-greek-morphology-part-24</id>
    <content type="html" xml:base="https://jktauber.com/2018/07/29/tour-greek-morphology-part-24/">&lt;p&gt;Part twenty-four of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;Now let&#39;s look at the imperfect forms corresponding to the active omega verbs we looked at in the present way back in &lt;a href=&#34;/2017/07/02/tour-greek-morphology-part-4/&#34;&gt;part 4&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We&#39;ll use &lt;strong&gt;IA-1&lt;/strong&gt; through &lt;strong&gt;IA-5&lt;/strong&gt; for the distinguisher patterns corresponding to the verbs that followed &lt;strong&gt;PA-1&lt;/strong&gt; through &lt;strong&gt;PA-5&lt;/strong&gt; in the present.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IA-1&lt;/th&gt;
&lt;th&gt;IA-2&lt;/th&gt;
&lt;th&gt;IA-3&lt;/th&gt;
&lt;th&gt;IA-4&lt;/th&gt;
&lt;th&gt;IA-5&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xον&lt;/td&gt;
&lt;td&gt;Xουν&lt;/td&gt;
&lt;td&gt;Xουν&lt;/td&gt;
&lt;td&gt;Xων&lt;/td&gt;
&lt;td&gt;Xων&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xες&lt;/td&gt;
&lt;td&gt;Xεις&lt;/td&gt;
&lt;td&gt;Xους&lt;/td&gt;
&lt;td&gt;Xᾱς&lt;/td&gt;
&lt;td&gt;Xης&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xε(ν)&lt;/td&gt;
&lt;td&gt;Xει&lt;/td&gt;
&lt;td&gt;Xου&lt;/td&gt;
&lt;td&gt;Xᾱ&lt;/td&gt;
&lt;td&gt;Xη&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xομεν&lt;/td&gt;
&lt;td&gt;Xοῦμεν&lt;/td&gt;
&lt;td&gt;Xοῦμεν&lt;/td&gt;
&lt;td&gt;Xῶμεν&lt;/td&gt;
&lt;td&gt;Xῶμεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xετε&lt;/td&gt;
&lt;td&gt;Xεῖτε&lt;/td&gt;
&lt;td&gt;Xοῦτε&lt;/td&gt;
&lt;td&gt;Xᾶτε&lt;/td&gt;
&lt;td&gt;Xῆτε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Xον&lt;/td&gt;
&lt;td&gt;Xουν&lt;/td&gt;
&lt;td&gt;Xουν&lt;/td&gt;
&lt;td&gt;Xων&lt;/td&gt;
&lt;td&gt;Xων&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Recall:&lt;/p&gt;
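&lt;table&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PA-1&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;barytone omega verbs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PA-2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;circumflex omega verbs with INF -εῖν / 3SG -εῖ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PA-3&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;circumflex omega verbs with INF -οῦν / 3SG -οῖ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PA-4&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;circumflex omega verbs with INF -ᾶν / 3SG -ᾷ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PA-5&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ζάω + compounds&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;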
&lt;p&gt;It is clear that the imperfect endings shown above had a theme vowel (alternating ο/ε exactly as with the present) which historically contracted with the preceding vowel (if it existed) under exactly the same rules as with the present forms (explained in detail in &lt;a href=&#34;/2017/07/17/tour-greek-morphology-part-8/&#34;&gt;part 8&lt;/a&gt;).&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;theme vowel - ending&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ο - ν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ε - ς&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3SG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ε -&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ο - μεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ε - τε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3PL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ο - ν&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
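&lt;p&gt;As a rough check that the theme vowel, the endings and the contractions really do account for the distinguisher table, here is a minimal sketch in Python. It ignores accents and the movable ν, and it only covers the ε, ο and α contractions described in this series (so not &lt;strong&gt;IA-5&lt;/strong&gt;). The names &lt;code&gt;CONTRACT&lt;/code&gt;, &lt;code&gt;THEME_AND_ENDING&lt;/code&gt; and &lt;code&gt;imperfect_endings&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
# Outcome of a stem-final vowel contracting with the theme vowel (accents ignored).
CONTRACT = {
    ("ε", "ε"): "ει", ("ε", "ο"): "ου",
    ("ο", "ε"): "ου", ("ο", "ο"): "ου",
    ("α", "ε"): "ᾱ", ("α", "ο"): "ω",
}

# Theme vowel and personal ending for each cell of the imperfect active.
THEME_AND_ENDING = {
    "1SG": ("ο", "ν"), "2SG": ("ε", "ς"), "3SG": ("ε", ""),
    "1PL": ("ο", "μεν"), "2PL": ("ε", "τε"), "3PL": ("ο", "ν"),
}


def imperfect_endings(stem_vowel=""):
    """Build unaccented distinguishers for one class from its stem-final vowel."""
    endings = {}
    for cell, (theme, ending) in THEME_AND_ENDING.items():
        vowel = CONTRACT[(stem_vowel, theme)] if stem_vowel else theme
        endings[cell] = "X" + vowel + ending
    return endings


for label, stem_vowel in [("IA-1", ""), ("IA-2", "ε"), ("IA-3", "ο"), ("IA-4", "α")]:
    print(label, imperfect_endings(stem_vowel))
```

&lt;p&gt;Apart from the missing accents, the output lines up with the &lt;strong&gt;IA-1&lt;/strong&gt; to &lt;strong&gt;IA-4&lt;/strong&gt; columns, including the way &lt;strong&gt;IA-2&lt;/strong&gt; and &lt;strong&gt;IA-3&lt;/strong&gt; fall together wherever the theme vowel is ο.&lt;/p&gt;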
&lt;p&gt;Too often with paradigms we only look at the person/number alternations within a fixed tense/aspect/voice. Let&#39;s now look at the possible present / imperfect alternations in the endings we&#39;ve seen (ignoring the augment for now):&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;th&gt;&amp;nbsp;&lt;/th&gt;&lt;th&gt;present&lt;/th&gt;&lt;th&gt;imperfect&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th rowspan=&#34;2&#34;&gt;1SG&lt;/th&gt;  &lt;td&gt;Xω&lt;/td&gt;       &lt;td&gt;Xον&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;                          &lt;td&gt;Xῶ&lt;/td&gt;       &lt;td&gt;Xουν or Xων&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th rowspan=&#34;5&#34;&gt;2SG&lt;/th&gt;  &lt;td&gt;Xεις&lt;/td&gt;     &lt;td&gt;Xες&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;                          &lt;td&gt;Xεῖς&lt;/td&gt;     &lt;td&gt;Xεις&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;                          &lt;td&gt;Xοῖς&lt;/td&gt;     &lt;td&gt;Xους&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;                          &lt;td&gt;Xᾷς&lt;/td&gt;      &lt;td&gt;Xᾱς&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;                          &lt;td&gt;Xῇς&lt;/td&gt;      &lt;td&gt;Xης&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th rowspan=&#34;5&#34;&gt;3SG&lt;/th&gt;  &lt;td&gt;Xει&lt;/td&gt;      &lt;td&gt;Xε(ν)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;                          &lt;td&gt;Xεῖ&lt;/td&gt;      &lt;td&gt;Xει&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;                          &lt;td&gt;Xοῖ&lt;/td&gt;      &lt;td&gt;Xου&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;                          &lt;td&gt;Xᾷ&lt;/td&gt;       &lt;td&gt;Xᾱ&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;                          &lt;td&gt;Xῇ&lt;/td&gt;       &lt;td&gt;Xη&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th rowspan=&#34;3&#34;&gt;3PL&lt;/th&gt;  &lt;td&gt;Xουσι(ν)&lt;/td&gt; &lt;td&gt;Xον&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;                          &lt;td&gt;Xοῦσι(ν)&lt;/td&gt; &lt;td&gt;Xουν&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;                          &lt;td&gt;Xῶσι(ν)&lt;/td&gt;  &lt;td&gt;Xων&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;The &lt;strong&gt;1PL&lt;/strong&gt; and &lt;strong&gt;2PL&lt;/strong&gt; endings are identical between present and imperfect.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part twenty-four of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">The Normalisation Column in MorphGNT</title>
    <link href="https://jktauber.com/2018/07/23/normalisation-column-morphgnt/" rel="alternate" type="text/html" title="The Normalisation Column in MorphGNT"/>
    <published>2018-07-23</published>
    <updated>2018-07-23</updated>
    <id>https://jktauber.com/2018/07/23/normalisation-column-morphgnt</id>
    <content type="html" xml:base="https://jktauber.com/2018/07/23/normalisation-column-morphgnt/">&lt;p&gt;Eliran Wong asked for a more detailed description of the “normalisation” column in MorphGNT so I promised him I’d write a blog post about it.&lt;/p&gt;
&lt;p&gt;I first outlined the objective of the column in a &lt;a href=&#34;/2005/08/30/upcoming-new-morphgnt/&#34;&gt;2005 blog post&lt;/a&gt; but enough time has passed and new work done that I thought it was worthy of a new post.&lt;/p&gt;
&lt;p&gt;The core idea of the normalised column is to give the inflected form as it would be stated in isolation.&lt;/p&gt;
&lt;p&gt;To use the example from the 2005 post, consider the phrase in Matthew 1.20:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;τὴν γυναῖκά σου&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If you were to ask someone what the accusative singular feminine definite article is, you&#39;d expect the answer τήν and not τὴν. Similarly, if you asked what the accusative singular of γυνή is, you&#39;d expect the answer γυναῖκα and not γυναῖκά. The differences in Matthew 1.20 are contextual and, for many applications (particularly morphology), aren&#39;t of much interest.&lt;/p&gt;
&lt;p&gt;And so years ago, I went about adding a new column that normalised this sort of thing. Similarly μετά, μεθ&#39;, μετ&#39;, and μετὰ all get normalised to μετά in this separate column.&lt;/p&gt;
&lt;p&gt;Back in the 2005 post, I enumerated the normalisations as:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;existing text may exhibit elision (e.g. μετ&#39; versus μετά)&lt;/li&gt;
&lt;li&gt;existing text may exhibit movable ς or ν&lt;/li&gt;
&lt;li&gt;final-acute may become grave&lt;/li&gt;
&lt;li&gt;enclitics may lose an accent&lt;/li&gt;
&lt;li&gt;word preceding an enclitic may gain an extra accent&lt;/li&gt;
&lt;li&gt;the οὐ / οὐκ / οὐχ alternation&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;When I published the SBLGNT analysis, another normalisation was added, namely the normalisation of capitalisation at the start of paragraphs or direct speech. The capitalisation is not an inherent part of the inflected form in isolation, only the particular context of the token, and so it is normalised.&lt;/p&gt;
&lt;p&gt;In &lt;a href=&#34;/2017/04/17/analysing-verbs-nestle-1904/&#34;&gt;Analysing the Verbs in Nestle 1904&lt;/a&gt; I covered some differences between the SBLGNT and Nestle 1904 analyses that normalisation would have smoothed over. Note that normalisation COULD go further (for example, spelling differences) but I chose not to do that in the normalisation column.&lt;/p&gt;
&lt;p&gt;In brief, the things NOT normalised include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;spelling&lt;/li&gt;
&lt;li&gt;crasis (e.g. κἀγώ vs καὶ ἐγώ)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In &lt;a href=&#34;/2015/11/27/annotating-normalization-column-morphgnt-part-1/&#34;&gt;Annotating the Normalization Column in MorphGNT: Part 1&lt;/a&gt; I started talking about annotating WHY each token was normalised the way it was and you can see some counts there for how many tokens underwent normalisation of accent or capitalisation, and how many had elision or a movable nu or sigma.&lt;/p&gt;
&lt;p&gt;In many cases, the normalisation can be automated without any need for human intervention (by having a list of elidable words, enclitics, etc). I&#39;ll soon publish my latest Python code for doing this. In some cases, manual checking is needed (although lemmatisation generally resolves a lot of the ambiguities). In &lt;a href=&#34;/2016/01/17/direct-speech-capitalization-first-preceding-head/&#34;&gt;Direct Speech Capitalization and the First Preceding Head&lt;/a&gt; I talked about the start of some work to go through all capitalisation and identify the reason for it. Similarly &lt;a href=&#34;/2017/02/15/new-morphgnt-releases-and-accentuation-analysis/&#34;&gt;New MorphGNT Releases and Accentuation Analysis&lt;/a&gt; discusses work on annotating the reason for all accentuation changes.&lt;/p&gt;
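&lt;p&gt;As a rough illustration of the automatable cases (this is not the code referred to above), here is a minimal Python sketch that handles just two of the normalisations: final grave to acute, and elision via a lookup list. The &lt;code&gt;ELIDED&lt;/code&gt; table is a hypothetical, deliberately tiny stand-in for a real list of elidable words, and the function name &lt;code&gt;normalise&lt;/code&gt; is an assumption made for the example.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;import unicodedata

GRAVE = "\u0300"  # combining grave accent
ACUTE = "\u0301"  # combining acute accent
APOSTROPHES = "'’᾽ʼ"  # characters that may mark elision at the end of a token


def nfc(text):
    """Compose text so that equivalent Greek spellings compare equal."""
    return unicodedata.normalize("NFC", text)


# Hypothetical sample of elided forms (without the apostrophe) and their full forms.
ELIDED = {nfc(k): nfc(v) for k, v in {
    "μετ": "μετά",
    "μεθ": "μετά",
    "ἐπ": "ἐπί",
    "ἐφ": "ἐπί",
}.items()}


def normalise(token):
    """Return (normalised form, list of reasons) for a single token."""
    form = nfc(token)
    # Elision: strip the apostrophe and look the remainder up in the list.
    if form and form[-1] in APOSTROPHES and form[:-1] in ELIDED:
        return ELIDED[form[:-1]], ["elision"]
    # Grave: decompose, swap the combining grave for an acute, recompose.
    reasons = []
    decomposed = unicodedata.normalize("NFD", form)
    if GRAVE in decomposed:
        decomposed = decomposed.replace(GRAVE, ACUTE)
        reasons.append("grave")
    return nfc(decomposed), reasons


for token in ["τὴν", "μετὰ", "μεθ'", "ἡμᾶς"]:
    print(token, *normalise(token))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Enclitic accents, movable letters, capitalisation, and the οὐ / οὐκ / οὐχ alternation are left out here; those need word lists or context that this sketch does not attempt to model.&lt;/p&gt;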
&lt;p&gt;There is still a lot more work needed to do this for the SBLGNT, but I did apply the idea when working on Seumas Macdonald&#39;s &lt;a href=&#34;https://github.com/seumasjeltzz/DigitalNyssa&#34;&gt;Digital Nyssa&lt;/a&gt; project. For that, I produced a file the first five lines of which are:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;Ἦλθε                    ἦλθε                    capitalisation
καὶ                     καί                     grave
ἐφ’                     ἐπί                     elision
ἡμᾶς                    ἡμᾶς                    
ἡ                       ἡ                       proclitic
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Here each token is normalised in the second column with the third column giving the reason for any difference between the token and the normalised form (and also indicating proclitics).&lt;/p&gt;
&lt;p&gt;The possible annotations (and there can be more than one on a token) are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;grave&lt;/li&gt;
&lt;li&gt;capitalisation&lt;/li&gt;
&lt;li&gt;elision&lt;/li&gt;
&lt;li&gt;movable&lt;/li&gt;
&lt;li&gt;extra&lt;/li&gt;
&lt;li&gt;proclitic&lt;/li&gt;
&lt;li&gt;enclitic&lt;/li&gt;
&lt;/ul&gt;
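&lt;p&gt;A file in this shape is straightforward to read back in. The following minimal Python sketch parses the five sample lines shown above; it assumes the columns are separated by whitespace and that multiple annotations on a token would appear as further whitespace-separated fields, neither of which is confirmed here. The names &lt;code&gt;SAMPLE&lt;/code&gt; and &lt;code&gt;parse_line&lt;/code&gt; are illustrative only.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;from collections import Counter

# The five sample lines from the Digital Nyssa file shown above.
SAMPLE = """\
Ἦλθε                    ἦλθε                    capitalisation
καὶ                     καί                     grave
ἐφ’                     ἐπί                     elision
ἡμᾶς                    ἡμᾶς
ἡ                       ἡ                       proclitic
"""


def parse_line(line):
    """Split one line into (token, normalised form, list of annotations)."""
    token, normalised, *annotations = line.split()
    return token, normalised, annotations


rows = [parse_line(line) for line in SAMPLE.splitlines() if line.strip()]

# Tokens whose normalised form differs from the text form.
changed = [row for row in rows if row[0] != row[1]]

# How often each annotation occurs.
counts = Counter(annotation for row in rows for annotation in row[2])

print(len(changed), "of", len(rows), "tokens changed")
print(dict(counts))
&lt;/code&gt;&lt;/pre&gt;
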
&lt;p&gt;I hope to eventually be able to provide the same for the entire SBLGNT (and other Greek texts).&lt;/p&gt;
&lt;p&gt;Doing all this normalisation has a number of benefits. It makes it easier to extract forms for studying morphology. It allows searches to work more as expected (you don&#39;t want to have to think up all the possible ways a form could actually be written in a text to search for it). It also allows much easier searching for particular phenomena (for example, particular clitic accentuation).&lt;/p&gt;
&lt;p&gt;It also allows for more rigorous validation of things like accentuation. Work in this area has already uncovered a number of accentuation errors in the SBLGNT text, for example, and could help with automated checking of OCR, etc.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Eliran Wong asked for a more detailed description of the “normalisation” column in MorphGNT so I promised him I’d write a blog post about it.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 23</title>
    <link href="https://jktauber.com/2018/05/26/tour-greek-morphology-part-23/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 23"/>
    <published>2018-05-26</published>
    <updated>2018-05-26</updated>
    <id>https://jktauber.com/2018/05/26/tour-greek-morphology-part-23</id>
    <content type="html" xml:base="https://jktauber.com/2018/05/26/tour-greek-morphology-part-23/">&lt;p&gt;Part twenty-three of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;Okay, so we want to contrast two forms of the indicative generally referred to as the &#34;present&#34; and &#34;imperfect&#34;.&lt;/p&gt;
&lt;p&gt;As we always do with paradigms, we&#39;ll keep certain things constant (in this case, the lexeme, voice and mood) and vary things along one axis (person / number agreement) and another axis (present vs imperfect).&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;th&gt;&amp;nbsp;&lt;/th&gt; &lt;th&gt;present&lt;/th&gt; &lt;th&gt;imperfect&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;1SG&lt;/th&gt;    &lt;td&gt;λύω&lt;/td&gt;     &lt;td&gt;ἔλυον&lt;/td&gt;    &lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;2SG&lt;/th&gt;    &lt;td&gt;λύεις&lt;/td&gt;   &lt;td&gt;ἔλυες&lt;/td&gt;    &lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;3SG&lt;/th&gt;    &lt;td&gt;λύει&lt;/td&gt;    &lt;td&gt;ἔλυε&lt;/td&gt;     &lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;1PL&lt;/th&gt;    &lt;td&gt;λύομεν&lt;/td&gt;  &lt;td&gt;ἐλύομεν&lt;/td&gt;  &lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;2PL&lt;/th&gt;    &lt;td&gt;λύετε&lt;/td&gt;   &lt;td&gt;ἐλύετε&lt;/td&gt;   &lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;3PL&lt;/th&gt;    &lt;td&gt;λύουσι&lt;/td&gt;  &lt;td&gt;ἔλυον&lt;/td&gt;    &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;There are numerous things which should stand out:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the &lt;strong&gt;imperfect&lt;/strong&gt; forms all have an initial ἐ-&lt;/li&gt;
&lt;li&gt;this is then followed by the same λυ root found in the &lt;strong&gt;present&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;this is then followed by an ε/ο &#34;theme&#34; vowel&lt;/li&gt;
&lt;li&gt;the &lt;strong&gt;1SG&lt;/strong&gt; and &lt;strong&gt;3PL&lt;/strong&gt; are identical in the &lt;strong&gt;imperfect&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;the &lt;strong&gt;present&lt;/strong&gt; and &lt;strong&gt;imperfect&lt;/strong&gt; share the same ending in the &lt;strong&gt;1PL&lt;/strong&gt; and in the &lt;strong&gt;2PL&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There&#39;s another perhaps more subtle thing you may notice:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the endings in the &lt;strong&gt;imperfect&lt;/strong&gt; &lt;strong&gt;2SG&lt;/strong&gt; and &lt;strong&gt;3SG&lt;/strong&gt; are the same as the &lt;strong&gt;present&lt;/strong&gt; &lt;em&gt;without the ι&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Recall also that the -ουσι ending in the &lt;strong&gt;present&lt;/strong&gt; &lt;strong&gt;3PL&lt;/strong&gt; historically came from -οντι. Without the ι, that would be -οντ and given Greek words can only end in ν, ς, or a vowel, dropping the τ from -οντ would give us the -ον we see.&lt;/p&gt;
&lt;p&gt;Furthermore, if we consider the &lt;em&gt;athematic&lt;/em&gt; &lt;strong&gt;1SG&lt;/strong&gt; ending -μι and drop the ι, we get -μ. This is not one of the sounds a Greek word can end in and historically, this was changed to a ν. This gives us the -ον we see in the &lt;strong&gt;1SG&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;So it seems that &lt;em&gt;historically&lt;/em&gt; the relationship between the two sets of endings has to do with the existence or non-existence of an ι. The only exceptions are the &lt;strong&gt;1PL&lt;/strong&gt; and &lt;strong&gt;2PL&lt;/strong&gt;. Interestingly these are the only two-syllable endings (counting the theme vowel).&lt;/p&gt;
&lt;p&gt;It could even be stated (at least in the earlier history) as: &lt;strong&gt;imperfect&lt;/strong&gt; has ἐ- but not -ι- and the &lt;strong&gt;present&lt;/strong&gt; has -ι- but not ἐ-, except in the two-syllable ending cases where the only contrast is the existence or absence of ἐ-.&lt;/p&gt;
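&lt;p&gt;The comparison of endings can also be checked mechanically. The following is a small Python sketch over the unaccented endings of λύω from the table above (theme vowel included); the function name &lt;code&gt;relation&lt;/code&gt; and its labels are illustrative only, and the sketch works purely on the surface forms, so the cells that need the historical argument are simply flagged as such.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;# Unaccented endings (theme vowel plus personal ending) read off the λύω table.
PRESENT = {
    "1SG": "ω", "2SG": "εις", "3SG": "ει",
    "1PL": "ομεν", "2PL": "ετε", "3PL": "ουσι",
}
IMPERFECT = {
    "1SG": "ον", "2SG": "ες", "3SG": "ε",
    "1PL": "ομεν", "2PL": "ετε", "3PL": "ον",
}


def relation(cell):
    """Describe how the imperfect ending relates to the present ending."""
    present, imperfect = PRESENT[cell], IMPERFECT[cell]
    if present == imperfect:
        return "identical"
    if present.replace("ι", "") == imperfect:
        return "present without ι"
    return "needs the historical explanation"


summary = {cell: relation(cell) for cell in PRESENT}
for cell, description in summary.items():
    print(cell, PRESENT[cell], IMPERFECT[cell], description)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;On the surface forms alone, only the &lt;strong&gt;2SG&lt;/strong&gt; and &lt;strong&gt;3SG&lt;/strong&gt; come out as &#34;present without ι&#34; and the &lt;strong&gt;1PL&lt;/strong&gt; and &lt;strong&gt;2PL&lt;/strong&gt; as identical; the &lt;strong&gt;1SG&lt;/strong&gt; and &lt;strong&gt;3PL&lt;/strong&gt; only fit the pattern once the earlier -μι and -οντι are taken into account.&lt;/p&gt;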
&lt;p&gt;We&#39;ve only looked at λύω / ἔλυον so far, so in the next couple of posts we&#39;ll look to see how the imperfect endings work in other lexemes.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part twenty-three of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 22</title>
    <link href="https://jktauber.com/2018/05/16/tour-greek-morphology-part-22/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 22"/>
    <published>2018-05-16</published>
    <updated>2018-05-16</updated>
    <id>https://jktauber.com/2018/05/16/tour-greek-morphology-part-22</id>
    <content type="html" xml:base="https://jktauber.com/2018/05/16/tour-greek-morphology-part-22/">&lt;p&gt;Part twenty-two of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;I’ve deliberated for a while about whether to follow the &lt;em&gt;present&lt;/em&gt; with the &lt;em&gt;imperfect&lt;/em&gt; or with the &lt;em&gt;aorist&lt;/em&gt;. I had recently elected to go with the aorist but as I sketched out what I wanted to say, I realised it would be easier if I’d said some things about the imperfect first.&lt;/p&gt;
&lt;p&gt;And so I’ve decided to do a few posts about the imperfect.&lt;/p&gt;
&lt;p&gt;We won’t talk about the endings in this post. I want us to &lt;em&gt;start&lt;/em&gt; thinking about the imperfect and its relationship to the present not in terms of endings but in terms of the overall paradigm structure.&lt;/p&gt;
&lt;p&gt;In previous posts, we saw that the present comes in two voices: an active and a middle (although we haven’t yet touched on the notion of presents coming in &lt;em&gt;both&lt;/em&gt; versus &lt;em&gt;just one&lt;/em&gt; of these). Within each voice, we looked at six indicative forms (corresponding to patterns of person and number agreement) and an infinitive (which effectively just has no person or number). We haven’t yet covered this, but each present voice also has imperative forms, subjunctive and optative forms, and participles in each of three genders.&lt;/p&gt;
&lt;p&gt;The imperfect, in contrast, only has the indicative forms. No infinitive, no participles, no imperative, no subjunctive, and no optative.&lt;/p&gt;
&lt;p&gt;We might be tempted to think of this in terms of the imperfect somehow being “defective”, as if we were doing a feature comparison like this:&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;th&gt;&amp;nbsp;&lt;/th&gt;&lt;th&gt;present&lt;/th&gt;&lt;th&gt;imperfect&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;indicatives&lt;/th&gt;  &lt;td&gt;✓&lt;/td&gt;&lt;td&gt;✓&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;infinitives&lt;/th&gt;  &lt;td&gt;✓&lt;/td&gt;&lt;td&gt;✗&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;imperatives&lt;/th&gt;  &lt;td&gt;✓&lt;/td&gt;&lt;td&gt;✗&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;subjunctives&lt;/th&gt; &lt;td&gt;✓&lt;/td&gt;&lt;td&gt;✗&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;optatives&lt;/th&gt;    &lt;td&gt;✓&lt;/td&gt;&lt;td&gt;✗&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;participles&lt;/th&gt;  &lt;td&gt;✓&lt;/td&gt;&lt;td&gt;✗&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;But another way is to think of the imperfect as being &lt;em&gt;part&lt;/em&gt; of the “present” family and providing a contrasting set of indicatives.&lt;/p&gt;
&lt;p&gt;So we have:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;indicatives 1 (“present”)&lt;/li&gt;
&lt;li&gt;indicatives 2 (“imperfect”)&lt;/li&gt;
&lt;li&gt;infinitives&lt;/li&gt;
&lt;li&gt;imperatives&lt;/li&gt;
&lt;li&gt;subjunctives&lt;/li&gt;
&lt;li&gt;optatives&lt;/li&gt;
&lt;li&gt;participles&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This model suggests that, say, the infinitive or imperatives or participles, are just as much the infinitive, imperatives, or participles of the imperfect as they are of the present.&lt;/p&gt;
&lt;p&gt;This also leads to the need for a new name for this entire family. Traditionally it’s referred to as the “present system” because of the shared stems, but as I&#39;ve ranted on this blog before, I think it’s unfortunate to use “present” for both the entire system and for one of the two types of indicatives within it.&lt;/p&gt;
&lt;p&gt;For reasons we&#39;ll touch on later, the system could perhaps better be called the “imperfective system”.&lt;/p&gt;
&lt;p&gt;But the remainder of posts on the imperfects will focus on their endings and, in particular, the contrast with the other set of indicatives (the “present” indicatives we’ve been talking about in the previous posts).&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part twenty-two of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">First Impressions of John Lee’s Accents Book</title>
    <link href="https://jktauber.com/2018/04/25/first-impressions-john-lees-accents-book/" rel="alternate" type="text/html" title="First Impressions of John Lee’s Accents Book"/>
    <published>2018-04-25</published>
    <updated>2018-04-25</updated>
    <id>https://jktauber.com/2018/04/25/first-impressions-john-lees-accents-book</id>
    <content type="html" xml:base="https://jktauber.com/2018/04/25/first-impressions-john-lees-accents-book/">&lt;p&gt;John Lee’s &lt;i&gt;Basics of Greek Accents&lt;/i&gt; was released today. Here are some first impressions.&lt;/p&gt;
&lt;p&gt;Like D. A. Carson’s 1985 book &lt;em&gt;Greek Accents: A Student’s Manual&lt;/em&gt;, Lee’s new book (based on notes from a class he taught at Macquarie University) is designed to backfill knowledge of Greek accents for those students whose beginning Greek skipped over them.&lt;/p&gt;
&lt;p&gt;At least since Wenham’s &lt;em&gt;Elements of New Testament Greek&lt;/em&gt;, there has been a trend in beginning New Testament Greek (and perhaps Classical Greek) textbooks to do away with instruction about accentuation. I haven’t investigated, but I suspect this correlates with a reduction in English-to-Greek exercises in textbooks too.&lt;/p&gt;
&lt;p&gt;Lee, like Carson before him, considers an understanding of accents to be vital to learning Greek. The book, published by Zondervan, is clearly (in name and cover design) intended by them to fill the gap left by Mounce’s &lt;em&gt;Basics of Biblical Greek&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Lee’s book is small: 110 pages and about the size of a 5 x 7 photograph. It’s compact but lucid nevertheless. The modern typography makes for more pleasant reading than both Carson’s book and Probert’s 2003 &lt;em&gt;New Short Guide to the Accentuation of Ancient Greek&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;It’s a gentler introduction than either Carson or Probert. There are eight chapters or &#34;lessons&#34; and each has two sets of exercises (marked as &#34;In Class&#34; and &#34;Homework&#34;). All exercises involve adding accents to unaccented text. Examples and exercises are NT focused but not exclusively and the book would be more than suitable for Classical Greek students as well.&lt;/p&gt;
&lt;p&gt;As is understandable given its goals, there are no theoretical underpinnings given and little historical explanation.&lt;/p&gt;
&lt;p&gt;I’ve found a few places where, given it’s for beginners (albeit those who know some Greek), I wish Lee had been a little more explicit. For example he says that &#34;Aorist active infinitives in -σαι accent on second last&#34; but never explains when one might expect an acute versus a circumflex. A one line rule with several examples is typical. But it is rare that all the edge cases are covered.&lt;/p&gt;
&lt;p&gt;After saying that the verb is generally recessive, he gives various forms of λύω including the subjunctive λυθῶ. He gives contraction as the reason for this one deviant form, but that is the last thing he says about subjunctives other than a remark a couple of pages later about ἀποδῷ being the pattern for compound -μι verbs.&lt;/p&gt;
&lt;p&gt;While Lee is a gentler introduction, one thing I like about Carson’s book on accents is he’ll often be a little more exploratory, considering a new form and whether previous rules are adequate to cover the evidence, and only once motivated, introduce a new rule. In doing this, students are encouraged to think a little more about how the rules interact. In a way, Carson’s approach is more like what I’ve been trying to do with my morphology blog posts.&lt;/p&gt;
&lt;p&gt;While there’s much to commend it as a first introduction to accents, I do find Lee often misses the forest and instead just catalogs the trees. There’s little view of the whole as a system, how the parts interact. I understand why you don’t start with that, but I feel you need to get to it eventually.&lt;/p&gt;
&lt;p&gt;As an example, I recently summarised the first and second declension noun accents as follows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;by default the accent is persistent&lt;/li&gt;
&lt;li&gt;however, if the ending is a different length than in the base form (nominative singular), the law of limitation may require an accent change (e.g. X́XS -&amp;gt; XX́L, L̃S -&amp;gt; ĹL, ĹL -&amp;gt; L̃S)&lt;/li&gt;
&lt;li&gt;if the base form is oxytone, it becomes perispomenon (X́-&amp;gt;L̃) in oblique cases (genitive and dative)&lt;/li&gt;
&lt;li&gt;in the 1st declension, the genitive plural is always perispomenon -ῶν (even if the base is not oxytone)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I gave examples of contrasting pairs for every accentuation and syllable length combination in both the first and second declension, and highlighted various things like the importance of building an intuition for the L̃S ~ ĹL alternation (the σωτῆρα rule). I also pointed out that the oblique case perispomenon (XL̃) is only possible because all oblique case endings are long.&lt;/p&gt;
&lt;p&gt;Now, I’m not suggesting that this is sufficient—it needs a certain amount of unpacking and is jargon heavy. But this, or something similar, makes a nice summary that ties multiple things together in explaining the first and second declension. It covers the fact that persistence and the law of limitation might be in conflict and how that gets resolved. It explains what happens to oxytones in the oblique cases, and gives the exception of 1st declension genitive plural, pointing out this is not limited just to the oxytones like the previous rule.&lt;/p&gt;
&lt;p&gt;In contrast, Lee covers the relevant rules but never brings them together in the context of a single paradigm (other than θεός which hardly demonstrates most of the points). The statement about the genitive plural is 28 pages later than the statement about circumflexes in the oblique when the base form is oxytone. His examples of the law of limitation do cover a couple of direct~oblique alternations but that is isolated from the chapter on noun accentuation and is never explained in the context of vowel length patterns in the noun endings.&lt;/p&gt;
&lt;p&gt;All in all, however, I think Lee’s book is a good first introduction to Greek accentuation and its presentation is undoubtedly cleaner than that of previous books. My main criticism is that it is incomplete and students would benefit from some consolidation of the principles taught. Some of that criticism may be mitigated in a classroom situation, for which it was originally intended. Students working alone might have more questions than the book answers. I would recommend something like Probert as a follow on (it will also make a better reference). That said, I think Lee achieves his aim in providing the &#34;basics&#34; and (to quote the back cover blurb) &#34;a foundation [students] will use as they continue their studies&#34;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">John Lee’s &lt;i&gt;Basics of Greek Accents&lt;/i&gt; was released today. Here are some first impressions.</summary>
  </entry><entry>
    <title type="html">Conference Time</title>
    <link href="https://jktauber.com/2018/03/18/conference-time/" rel="alternate" type="text/html" title="Conference Time"/>
    <published>2018-03-18</published>
    <updated>2018-03-18</updated>
    <id>https://jktauber.com/2018/03/18/conference-time</id>
    <content type="html" xml:base="https://jktauber.com/2018/03/18/conference-time/">&lt;p&gt;I’m off for another string of conferences, this time in Copenhagen, Chicago, and New Orleans.&lt;/p&gt;
&lt;p&gt;First is a workshop on &lt;em&gt;Original Language Resources for Bible Translation and Education&lt;/em&gt; organised by Nicolai Winther-Nielsen of the Global Learning Initiative and Reinier de Blois of the United Bible Societies. David Instone-Brewer put it best when he responded to the workshop invitation with &#34;All the key people in one place with lots of time to talk and plan. How could I miss this?&#34; Perhaps most exciting for me is I finally get to meet Ulrik Sandborg-Petersen for the first time after working together for more than twelve years!&lt;/p&gt;
&lt;p&gt;I fly from Copenhagen to Chicago at the end of the week for the annual conference of the American Association for Applied Linguistics. It will be my first time attending the conference and I&#39;m looking forward to learning a lot (although in contrast to the Copenhagen workshop, I&#39;ll know virtually no one).&lt;/p&gt;
&lt;p&gt;I have to leave AAAL slightly early though, to go down to New Orleans for the first US VueConf. Vue.JS is an important technology in the Scaife Viewer and DeepReader reading environments. I went to the first European VueConf last year and gave a lightning talk on DeepReader. I had hoped to give a talk on the Scaife Viewer at VueConf US but my talk wasn&#39;t accepted so I&#39;m hoping at least for another lightning talk.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I’m off for another string of conferences, this time in Copenhagen, Chicago, and New Orleans.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 21</title>
    <link href="https://jktauber.com/2018/03/10/tour-greek-morphology-part-21/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 21"/>
    <published>2018-03-10</published>
    <updated>2018-03-10</updated>
    <id>https://jktauber.com/2018/03/10/tour-greek-morphology-part-21</id>
    <content type="html" xml:base="https://jktauber.com/2018/03/10/tour-greek-morphology-part-21/">&lt;p&gt;Part twenty-one of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;I started this series with&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I ultimately hope to cover everything that a beginner-intermediate grammar might but &lt;strong&gt;in a much more exploratory fashion&lt;/strong&gt;. I’ll occasionally touch on morphological theory but I mostly want to point out phenomena in the language that students &lt;strong&gt;have already seen&lt;/strong&gt; but perhaps have not thought about in any depth.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;(emphasis added)&lt;/p&gt;
&lt;p&gt;In short, the primary goal has been (and will continue to be) to take data the reader already is assumed to know and to make observations and construct relationships that the reader perhaps didn’t already realise or know. The secondary goal is to talk a little bit about linguistic theory and historical linguistics in relation to the specific phenomena being discussed.&lt;/p&gt;
&lt;p&gt;Now that we’ve finished our first pass over (particularly the endings of) the present indicatives and infinitives, I wanted to summarise a few key points we’ve touched on that are of a more conceptual nature.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A paradigm is a way of showing related forms next to one another for comparison. We often keep some morphosyntactic properties constant while varying others. We often, but not always, keep the lexeme constant.&lt;/li&gt;
&lt;li&gt;We can look at paradigms along (at least) three dimensions: (1) we can take one lexeme’s inflection and look at what stays the same and what changes in different cells; (2) we can take a morphosyntactic property set and look at what stays the same and what changes across different lexemes; (3) we can take a &lt;em&gt;subset&lt;/em&gt; of morphosyntactic properties and vary them while keeping the rest of the set (and the lexeme) fixed.&lt;/li&gt;
&lt;li&gt;Greek rarely has a one-to-one mapping between an individual morphosyntactic property and some surface property of the inflected form.&lt;/li&gt;
&lt;li&gt;There are some cells in a paradigm that are highly predictable and others that are highly predictive.&lt;/li&gt;
&lt;li&gt;There are relationships between cells which are often more helpful than relationships between a cell and its underlying or historical stem.&lt;/li&gt;
&lt;li&gt;The primary role of morphology is to discriminate between alternatives, not build up compositional meaning.&lt;/li&gt;
&lt;li&gt;Ambiguity in morphology can be tolerated if other things (syntax, context) help disambiguate.&lt;/li&gt;
&lt;li&gt;There is a big difference between looking at patterns in the surface forms and exploring the historical reasons those patterns developed. While the latter is vital for answering “why”, it is not a crucial part of language acquisition. (Native English speakers don’t acquire strong verbs by understanding how Proto-Indo-European ablaut patterns led to Germanic inflectional classes!)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As well as these conceptual points, we’ve talked about the actual endings, inflectional classes, vowel contractions, frequency effects, and which cells might be the best to use as a lemma.&lt;/p&gt;
&lt;p&gt;We also spent time actually testing our models against the corpus data with some Python scripts and showed how that uncovered some patterns we hadn’t previously considered.&lt;/p&gt;
&lt;p&gt;We haven’t looked at everything to do with the presents, but it’s time to move on, at least for a while, to a different part of the verbal system.&lt;/p&gt;
&lt;p&gt;That said, if you have any questions about the previous twenty parts, or any questions you&#39;re hoping will be answered in subsequent posts, just leave a comment (or email me if you want to ask anonymously).&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part twenty-one of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 20</title>
    <link href="https://jktauber.com/2018/03/05/tour-greek-morphology-part-20/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 20"/>
    <published>2018-03-05</published>
    <updated>2018-03-05</updated>
    <id>https://jktauber.com/2018/03/05/tour-greek-morphology-part-20</id>
    <content type="html" xml:base="https://jktauber.com/2018/03/05/tour-greek-morphology-part-20/">&lt;p&gt;Part twenty of a tour through Greek inflectional morphology to help get
students thinking more systematically about the word forms they see (and maybe
teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;In &lt;a href=&#34;/2017/10/16/tour-greek-morphology-part-17/&#34;&gt;part 17&lt;/a&gt;,
we went through counts for our present active (infinitive and indicative) classes. Now we&#39;ll wrap things up by doing the same for the middle.&lt;/p&gt;
&lt;p&gt;Recall this is based on the analysis of 820 tokens available
&lt;a href=&#34;https://gist.github.com/jtauber/accb8180f56fceee37f57a040faa4b8a&#34;&gt;here&lt;/a&gt;
which was described in the last two parts.&lt;/p&gt;
&lt;p&gt;Let us first of all look at the number of distinct lemmas in each of our 14 classes.&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-1&lt;/th&gt;        &lt;td&gt;barytone thematics with INF -εσθαι / 3SG -εται&lt;/td&gt;                               &lt;td&gt;105&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-2&lt;/th&gt;        &lt;td&gt;circumflex thematics with INF -εῖσθαι / 3SG -εῖται&lt;/td&gt;                                               &lt;td&gt;21&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-3&lt;/th&gt;        &lt;td&gt;circumflex thematics with INF -οῦσθαι / 3SG -οῦται (ζηλόω, ἐλαττόω, λυτρόομαι, διαβεβαιόομαι)&lt;/td&gt;    &lt;td&gt;4&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-4&lt;/th&gt;        &lt;td&gt;circumflex thematics with INF -ᾶσθαι / 3SG -ᾶται&lt;/td&gt;                                                 &lt;td&gt;11&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-5&lt;/th&gt;        &lt;td&gt;circumflex thematics with INF -ῆσθαι / 3SG -ῆται   (χράομαι and compound)&lt;/td&gt;                        &lt;td&gt;2&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-6a&lt;/th&gt;       &lt;td&gt;INF -υσθαι / 3SG -υται                             (ἀπόλλυμι, ἐνδείκνυμι, συναναμίγνυμι)&lt;/td&gt;         &lt;td&gt;3&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-7&lt;/th&gt;        &lt;td&gt;INF -εσθαι / 3SG -εται                             (compounds of τίθημι)&lt;/td&gt;                         &lt;td&gt;3&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-8&lt;/th&gt;        &lt;td&gt;INF -οσθαι / 3SG -οται&lt;/td&gt;                                                                           &lt;td&gt;-&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-9&lt;/th&gt;        &lt;td&gt;INF -ασθαι / 3SG -αται                             (δύναμαι, compounds of ἵστημι)&lt;/td&gt;                &lt;td&gt;8&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-10&lt;/th&gt;       &lt;td&gt;ἧμαι&lt;/td&gt;                                                                                             &lt;td&gt;-&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-10-COMP&lt;/th&gt;  &lt;td&gt;compounds of ἧμαι                                  (κάθημαι)&lt;/td&gt;                                     &lt;td&gt;1&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-11&lt;/th&gt;       &lt;td&gt;κεῖμαι&lt;/td&gt;                                                                                           &lt;td&gt;1&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-11-COMP&lt;/th&gt;  &lt;td&gt;compounds of κεῖμαι&lt;/td&gt;                                                                              &lt;td&gt;7&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-12&lt;/th&gt;       &lt;td&gt;οἶμαι&lt;/td&gt;                                                                                            &lt;td&gt;1&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;Again, even the small counts are elevated due to compound verbs. Folding
compounds of the same base verb, only &lt;strong&gt;PM-1&lt;/strong&gt;, &lt;strong&gt;PM-2&lt;/strong&gt;, &lt;strong&gt;PM-3&lt;/strong&gt;, &lt;strong&gt;PM-4&lt;/strong&gt;,
and &lt;strong&gt;PM-6a&lt;/strong&gt; have more than one or two members (and &lt;strong&gt;PM-6a&lt;/strong&gt; only has three).&lt;/p&gt;
&lt;p&gt;This is just looking at the number of unique lemmas in each class but there are
two other sets of numbers that are worth looking at:
(1) the total number of tokens in the SBLGNT;
(2) the distribution of classes amongst the hapax legomena.&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;th&gt;class&lt;/th&gt;              &lt;th&gt;lemmas&lt;/th&gt;   &lt;th&gt;tokens&lt;/th&gt;  &lt;th&gt;hapax&lt;/th&gt;   &lt;th&gt;hapax details&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-1&lt;/th&gt;        &lt;td&gt;105&lt;/td&gt;      &lt;td&gt;523&lt;/td&gt;     &lt;td&gt;45&lt;/td&gt;      &lt;td&gt;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-2&lt;/th&gt;        &lt;td&gt;21&lt;/td&gt;       &lt;td&gt;57&lt;/td&gt;      &lt;td&gt;7&lt;/td&gt;&lt;td&gt;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-3&lt;/th&gt;        &lt;td&gt;4&lt;/td&gt;        &lt;td&gt;5&lt;/td&gt;       &lt;td&gt;3&lt;/td&gt;       &lt;td&gt;ζηλόω ἐλαττόω λυτρόομαι&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-4&lt;/th&gt;        &lt;td&gt;11&lt;/td&gt;       &lt;td&gt;33&lt;/td&gt;      &lt;td&gt;4&lt;/td&gt;       &lt;td&gt;μυκάομαι κοιμάομαι καταράομαι ἐγκαυχάομαι&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-5&lt;/th&gt;        &lt;td&gt;2&lt;/td&gt;        &lt;td&gt;2&lt;/td&gt;       &lt;td&gt;2&lt;/td&gt;       &lt;td&gt;χράομαι and συγχράομαι&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-6a&lt;/th&gt;       &lt;td&gt;3&lt;/td&gt;        &lt;td&gt;9&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;&lt;td&gt;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-7&lt;/th&gt;        &lt;td&gt;3&lt;/td&gt;        &lt;td&gt;5&lt;/td&gt;       &lt;td&gt;2&lt;/td&gt;       &lt;td&gt;διατίθεμαι and μετατίθημι&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-8&lt;/th&gt;        &lt;td&gt;-&lt;/td&gt;        &lt;td&gt;-&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;&lt;td&gt;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-9&lt;/th&gt;        &lt;td&gt;8&lt;/td&gt;        &lt;td&gt;156&lt;/td&gt;     &lt;td&gt;4&lt;/td&gt;       &lt;td&gt;ἐξίστημι ἐφίστημι ἀνθίστημι ἀφίσταμαι&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-10&lt;/th&gt;       &lt;td&gt;-&lt;/td&gt;        &lt;td&gt;- &lt;/td&gt;      &lt;td&gt;-&lt;/td&gt;&lt;td&gt;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-10-COMP&lt;/th&gt;  &lt;td&gt;1&lt;/td&gt;        &lt;td&gt;5&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;&lt;td&gt;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-11&lt;/th&gt;       &lt;td&gt;1&lt;/td&gt;        &lt;td&gt;9&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;&lt;td&gt;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-11-COMP&lt;/th&gt;  &lt;td&gt;7&lt;/td&gt;        &lt;td&gt;15&lt;/td&gt;      &lt;td&gt;-&lt;/td&gt;&lt;td&gt;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PM-12&lt;/th&gt;       &lt;td&gt;1&lt;/td&gt;        &lt;td&gt;1&lt;/td&gt;       &lt;td&gt;1&lt;/td&gt;       &lt;td&gt;οἶμαι&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;Recall the hapax legomena matter because they give an indication of what
classes were still productive.&lt;/p&gt;
&lt;p&gt;If we fold compounds under their base verb, only &lt;strong&gt;PM-1&lt;/strong&gt;, &lt;strong&gt;PM-2&lt;/strong&gt;, &lt;strong&gt;PM-3&lt;/strong&gt;,
and &lt;strong&gt;PM-4&lt;/strong&gt; have more than one hapax legomenon.&lt;/p&gt;
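&lt;p&gt;As a rough sketch of how the three columns above could be produced (this is not the script actually used for the table), assume the tokens are available as hypothetical &lt;code&gt;(lemma, class)&lt;/code&gt; pairs and that a hapax is simply a lemma occurring exactly once in that list. The function name &lt;code&gt;class_stats&lt;/code&gt; and the toy data are made up for illustration:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;from collections import Counter, defaultdict


def class_stats(tokens):
    # tokens: iterable of (lemma, inflectional class) pairs, one per token
    tokens = list(tokens)
    lemma_counts = Counter(lemma for lemma, cls in tokens)
    lemmas_by_class = defaultdict(set)
    tokens_by_class = Counter()
    for lemma, cls in tokens:
        lemmas_by_class[cls].add(lemma)
        tokens_by_class[cls] += 1
    stats = {}
    for cls, lemmas in lemmas_by_class.items():
        # assumption: a hapax is a lemma seen exactly once in the supplied tokens
        hapax = sorted(lemma for lemma in lemmas if lemma_counts[lemma] == 1)
        stats[cls] = {
            &#34;lemmas&#34;: len(lemmas),
            &#34;tokens&#34;: tokens_by_class[cls],
            &#34;hapax&#34;: hapax,
        }
    return stats


# toy data with placeholder lemmas, not real counts
toy = [(&#34;a&#34;, &#34;PM-1&#34;), (&#34;a&#34;, &#34;PM-1&#34;), (&#34;b&#34;, &#34;PM-1&#34;), (&#34;c&#34;, &#34;PM-2&#34;)]
stats = class_stats(toy)
for cls in sorted(stats):
    print(cls, stats[cls])
&lt;/code&gt;&lt;/pre&gt;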
&lt;p&gt;Let&#39;s now look at counts for each paradigm cell for each class:&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;th&gt;&amp;nbsp;&lt;/th&gt;  &lt;th nowrap&gt;PM-1&lt;/th&gt; &lt;th nowrap&gt;PM-2&lt;/th&gt; &lt;th nowrap&gt;PM-3&lt;/th&gt; &lt;th nowrap&gt;PM-4&lt;/th&gt; &lt;th nowrap&gt;PM-5&lt;/th&gt; &lt;th nowrap&gt;PM-6a&lt;/th&gt;  &lt;th nowrap&gt;PM-7&lt;/th&gt; &lt;th nowrap&gt;PM-8&lt;/th&gt; &lt;th nowrap&gt;PM-9&lt;/th&gt; &lt;th nowrap&gt;PM-10-C&lt;/th&gt;  &lt;th nowrap&gt;PM-11&lt;/th&gt;  &lt;th nowrap&gt;PM-11-C&lt;/th&gt;  &lt;th nowrap&gt;PM-12&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;INF&lt;/th&gt; &lt;td&gt;89&lt;/td&gt; &lt;td&gt;15&lt;/td&gt; &lt;td&gt;4&lt;/td&gt; &lt;td&gt;8&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;4&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;12&lt;/td&gt; &lt;td&gt;2&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;3&lt;/td&gt; &lt;td&gt;-&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;1SG&lt;/th&gt; &lt;td&gt;85&lt;/td&gt; &lt;td&gt;17&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;3&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;1&lt;/td&gt; &lt;td&gt;4&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;9&lt;/td&gt;  &lt;td&gt;1&lt;/td&gt; &lt;td&gt;1&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;1&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;2SG&lt;/th&gt; &lt;td&gt;19&lt;/td&gt; &lt;td&gt;1&lt;/td&gt;  &lt;td&gt;-&lt;/td&gt; &lt;td&gt;5&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;7&lt;/td&gt;  &lt;td&gt;-&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;-&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;3SG&lt;/th&gt; &lt;td&gt;228&lt;/td&gt;&lt;td&gt;7&lt;/td&gt;  &lt;td&gt;-&lt;/td&gt; &lt;td&gt;8&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;74&lt;/td&gt; &lt;td&gt;2&lt;/td&gt; &lt;td&gt;7&lt;/td&gt; &lt;td&gt;11&lt;/td&gt;&lt;td&gt;-&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;1PL&lt;/th&gt; &lt;td&gt;20&lt;/td&gt; &lt;td&gt;4&lt;/td&gt;  &lt;td&gt;-&lt;/td&gt; &lt;td&gt;3&lt;/td&gt; &lt;td&gt;1&lt;/td&gt; &lt;td&gt;3&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;9&lt;/td&gt;  &lt;td&gt;-&lt;/td&gt; &lt;td&gt;1&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;-&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;2PL&lt;/th&gt; &lt;td&gt;24&lt;/td&gt; &lt;td&gt;9&lt;/td&gt;  &lt;td&gt;-&lt;/td&gt; &lt;td&gt;3&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;1&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;32&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;-&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;3PL&lt;/th&gt; &lt;td&gt;58&lt;/td&gt; &lt;td&gt;4&lt;/td&gt;  &lt;td&gt;1&lt;/td&gt; &lt;td&gt;3&lt;/td&gt; &lt;td&gt;1&lt;/td&gt; &lt;td&gt;1&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;13&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;-&lt;/td&gt; &lt;td&gt;1&lt;/td&gt; &lt;td&gt;-&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;total&lt;/th&gt;&lt;th&gt;523&lt;/th&gt;&lt;th&gt;57&lt;/th&gt;&lt;th&gt;5&lt;/th&gt;&lt;th&gt;33&lt;/th&gt;&lt;th&gt;2&lt;/th&gt; &lt;th&gt;9&lt;/th&gt; &lt;th&gt;5&lt;/th&gt; &lt;th&gt;-&lt;/th&gt; &lt;th&gt;156&lt;/th&gt;&lt;th&gt;5&lt;/th&gt; &lt;th&gt;9&lt;/th&gt; &lt;th&gt;15&lt;/th&gt;&lt;th&gt;1&lt;/th&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;As in the active, the &lt;strong&gt;3SG&lt;/strong&gt; and &lt;strong&gt;INF&lt;/strong&gt; dominate with only a few interesting
exceptions. The third person (especially &lt;strong&gt;3SG&lt;/strong&gt; but also &lt;strong&gt;3PL&lt;/strong&gt;) is unusually low in
&lt;strong&gt;PM-2&lt;/strong&gt;. In &lt;strong&gt;PM-9&lt;/strong&gt;, the &lt;strong&gt;2PL&lt;/strong&gt; is unusually high. This is almost certainly just
because of particular lexical items that happen to be in those classes rather than
an inherent characteristic of the class itself, although because the origins
of some classes are derivational, there may occasionally be tendencies on
semantic grounds.&lt;/p&gt;
&lt;p&gt;If the goal is just to identify the person/number, not the class,
(which is true in reception but not learning) then most of these numbers
collapse because of shared endings. Here are the counts just focused on the
common endings (without accents):&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;td&gt;INF&lt;/td&gt;             &lt;td&gt;-σθαι&lt;/td&gt;       &lt;td&gt;137&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;1SG&lt;/td&gt;             &lt;td&gt;-μαι&lt;/td&gt;        &lt;td&gt;122&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td rowspan=&#34;2&#34;&gt;2SG&lt;/td&gt; &lt;td&gt;-{ι}&lt;/td&gt;        &lt;td&gt;25&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;                         &lt;td&gt;-σαι&lt;/td&gt;        &lt;td&gt;7&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;3SG&lt;/td&gt;             &lt;td&gt;-ται&lt;/td&gt;        &lt;td&gt;337&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;1PL&lt;/td&gt;             &lt;td&gt;-μεθα&lt;/td&gt;       &lt;td&gt;41&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;2PL&lt;/td&gt;             &lt;td&gt;-σθε&lt;/td&gt;        &lt;td&gt;69&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;3PL&lt;/td&gt;             &lt;td&gt;-νται&lt;/td&gt;       &lt;td&gt;82&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
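&lt;p&gt;For anyone wanting to cross-check these totals, here is a small sketch that sums the per-class cell counts given earlier in this post (with 0 standing in for &#34;-&#34;). The variable names are just illustrative:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;CLASSES = [
    &#34;PM-1&#34;, &#34;PM-2&#34;, &#34;PM-3&#34;, &#34;PM-4&#34;, &#34;PM-5&#34;, &#34;PM-6a&#34;, &#34;PM-7&#34;,
    &#34;PM-8&#34;, &#34;PM-9&#34;, &#34;PM-10-C&#34;, &#34;PM-11&#34;, &#34;PM-11-C&#34;, &#34;PM-12&#34;,
]

# token counts per paradigm cell, one column per class, copied from the cell table
CELLS = {
    &#34;INF&#34;: [89, 15, 4, 8, 0, 4, 0, 0, 12, 2, 0, 3, 0],
    &#34;1SG&#34;: [85, 17, 0, 3, 0, 1, 4, 0, 9, 1, 1, 0, 1],
    &#34;2SG&#34;: [19, 1, 0, 5, 0, 0, 0, 0, 7, 0, 0, 0, 0],
    &#34;3SG&#34;: [228, 7, 0, 8, 0, 0, 0, 0, 74, 2, 7, 11, 0],
    &#34;1PL&#34;: [20, 4, 0, 3, 1, 3, 0, 0, 9, 0, 1, 0, 0],
    &#34;2PL&#34;: [24, 9, 0, 3, 0, 0, 1, 0, 32, 0, 0, 0, 0],
    &#34;3PL&#34;: [58, 4, 1, 3, 1, 1, 0, 0, 13, 0, 0, 1, 0],
}

# collapse across classes: one total per person/number (or infinitive) cell
cell_totals = {cell: sum(counts) for cell, counts in CELLS.items()}

# collapse across cells: one total per class
class_totals = dict(zip(CLASSES, (sum(col) for col in zip(*CELLS.values()))))

grand_total = sum(cell_totals.values())

for cell, total in cell_totals.items():
    print(cell, total)
print(&#34;all tokens&#34;, grand_total)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The row totals come out as 137, 122, 32, 337, 41, 69 and 82 (the 32 for the &lt;strong&gt;2SG&lt;/strong&gt; being the 25 + 7 split shown above) and they add up to the 820 tokens analysed.&lt;/p&gt;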

&lt;p&gt;And that&#39;s it for the present middles. I&#39;ll do a brief summary post next and
then we&#39;ll start exploring beyond the presents.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part twenty of a tour through Greek inflectional morphology to help get
students thinking more systematically about the word forms they see (and maybe
teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">New Draft Morphological Tags for MorphGNT</title>
    <link href="https://jktauber.com/2018/02/03/new-draft-morphological-tags-morphgnt/" rel="alternate" type="text/html" title="New Draft Morphological Tags for MorphGNT"/>
    <published>2018-02-03</published>
    <updated>2018-02-03</updated>
    <id>https://jktauber.com/2018/02/03/new-draft-morphological-tags-morphgnt</id>
    <content type="html" xml:base="https://jktauber.com/2018/02/03/new-draft-morphological-tags-morphgnt/">&lt;p&gt;I’ve finally done the work in translating the MorphGNT tagging system to a new proposal for initial feedback.&lt;/p&gt;
&lt;p&gt;At least going back to my initial collaboration with Ulrik Sandborg-Petersen in 2005, I&#39;ve been thinking about how I would do morphological tags in MorphGNT if I were starting from scratch.&lt;/p&gt;
&lt;p&gt;Much later, in 2014, I had some discussions with Mike Aubrey at my first SBL conference and put together a &lt;a href=&#34;https://github.com/morphgnt/sblgnt/wiki/Proposal-for-a-New-Tagging-Scheme&#34;&gt;straw proposal&lt;/a&gt;. There was a rethinking of some parts-of-speech, handling of tense/aspect, handling of voice, handling of syncretism and underspecification.&lt;/p&gt;
&lt;p&gt;Even though some of the ideas were more drastic than others, a few things have remained consistent in my thinking:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;there is value in a purely morphological analysis that doesn&#39;t disambiguate on syntactic or semantic grounds&lt;/li&gt;
&lt;li&gt;this analysis does not need the notion of parts-of-speech beyond purely &lt;a href=&#34;/2015/11/05/morphological-parts-speech-greek/&#34;&gt;Morphological Parts of Speech&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;this analysis should not attempt to distinguish middles and passives in the present or perfect system&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As part of the handling of syncretism and underspecification, I had originally suggested a need for a value for the case property that didn&#39;t distinguish nominative and accusative and a need for a value for the gender property like &#34;non-neuter&#34;.&lt;/p&gt;
&lt;p&gt;In the absence of feedback beyond a vague feeling that something &lt;em&gt;like&lt;/em&gt; this should be done, I didn&#39;t immediately make further progress but, a year later, started gathering more notes on &lt;a href=&#34;https://github.com/morphgnt/sblgnt/wiki/Handling-Ambiguity&#34;&gt;handling ambiguity&lt;/a&gt;. That then led to a more concrete proposal just around &lt;a href=&#34;https://github.com/morphgnt/sblgnt/wiki/Proposal-for-Gender-Tagging&#34;&gt;gender&lt;/a&gt; and &lt;a href=&#34;https://github.com/morphgnt/sblgnt/wiki/Proposal-for-Case-Tagging&#34;&gt;case&lt;/a&gt; (although not without open questions).&lt;/p&gt;
&lt;p&gt;I&#39;ve now implemented those smaller-scale proposals as a first draft for the MorphGNT SBLGNT and plan to apply them to other GNT texts soon. The &lt;code&gt;new-tags&lt;/code&gt; branch for MorphGNT SBLGNT is available at: &lt;a href=&#34;https://github.com/morphgnt/sblgnt/tree/new-tags&#34;&gt;https://github.com/morphgnt/sblgnt/tree/new-tags&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This adds a new column (the intention is not to replace existing analyses yet, just augment them) that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;makes voice formal not functional (while still using &lt;code&gt;P&lt;/code&gt; in the aorist and future for what Carl Conrad would call MP2)&lt;/li&gt;
&lt;li&gt;does not give morphosyntactic properties for uninflected words&lt;/li&gt;
&lt;li&gt;implements basic nominative/accusative case syncretism in the neuter with a single value&lt;/li&gt;
&lt;li&gt;implements basic non-neuter, non-feminine, and (in most genitive plurals) complete gender syncretism with a value for each&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;One immediate effect of this is that a list I have from Randall Tan of disagreements between the MorphGNT SBLGNT analysis and that of the Nestle 1904 largely goes away because many of them were merely different judgements of gender or case on non-morphological grounds. This new tag retains the uncertainty. Another benefit of the tagging scheme is that it provides a reasonable output for an automated morphological analysis system which can then, in a separate step, be disambiguated syntactically (or semantically), potentially with human input.&lt;/p&gt;
&lt;p&gt;There are some important things to note, however, as just saying &#34;this is a purely morphological analysis that doesn&#39;t disambiguate&#34; oversimplifies things greatly.&lt;/p&gt;
&lt;p&gt;Firstly, while punting distributional and semantic part-of-speech questions like &#34;is this an adverb or a conjunction&#34; or &#34;what type of pronoun is this&#34; is extremely helpful, there are still some questions that impact a purely morphological tagging such as whether to represent a fossilised verb acting as a particle as having morphological inflection.&lt;/p&gt;
&lt;p&gt;Secondly, there are what I have called &lt;strong&gt;extended syncretisms&lt;/strong&gt; not modelled where there can be uncertainty between properties taken as a pair. For example 1st person singular vs 3rd person plural in -ον, or 1st declension genitive singular vs accusative plural in -ας. It may be worth still conveying this ambiguity but just through disjunction, saying for example that a word is &lt;code&gt;GSF^APF&lt;/code&gt;. These are almost always phonological coincidences rather than structural syncretism and so should be modelled differently.&lt;/p&gt;
&lt;p&gt;Related to this is the &#34;double&#34; syncretism between accusative singular masculine and neuter on the one hand and nominative and accusative singular neuter on the other hand. If we model the latter as &lt;code&gt;CSN&lt;/code&gt; then we&#39;ve lost the former (which, by itself, could be modelled as &lt;code&gt;ASY&lt;/code&gt;). So, in a sense, &lt;code&gt;CSN&lt;/code&gt; and &lt;code&gt;ASY&lt;/code&gt; are syncretic (but also share an overlapping cell). &lt;code&gt;CSN^ASY&lt;/code&gt; doesn&#39;t quite seem right because of that overlap and the fact that this isn&#39;t just a phonological coincidence as best I know.&lt;/p&gt;
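&lt;p&gt;One way to make that overlap concrete is to expand each tag into the set of fully specified cells it covers. The following is only a sketch: the helper &lt;code&gt;expand&lt;/code&gt; is hypothetical, and it assumes just the two underspecified values discussed here (&lt;code&gt;C&lt;/code&gt; for nominative-or-accusative and &lt;code&gt;Y&lt;/code&gt; for masculine-or-neuter), with &lt;code&gt;^&lt;/code&gt; as the disjunction marker:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;# assumed value codes, inferred from the prose above
CASE = {&#34;C&#34;: &#34;NA&#34;}    # C = nominative or accusative
GENDER = {&#34;Y&#34;: &#34;MN&#34;}  # Y = masculine or neuter (non-feminine)


def expand(tag):
    # expand a case/number/gender tag (or a ^-separated disjunction of tags)
    # into the set of fully specified cells it covers
    cells = set()
    for part in tag.split(&#34;^&#34;):
        case_code, number, gender_code = part
        for c in CASE.get(case_code, case_code):
            for g in GENDER.get(gender_code, gender_code):
                cells.add(c + number + g)
    return cells


print(sorted(expand(&#34;CSN&#34;)))
print(sorted(expand(&#34;ASY&#34;)))
print(sorted(expand(&#34;CSN&#34;).intersection(expand(&#34;ASY&#34;))))
print(sorted(expand(&#34;GSF^APF&#34;)))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The intersection of the two expansions is the single accusative singular neuter cell, which is exactly the overlap that makes a plain disjunction feel wrong here, whereas the two halves of &lt;code&gt;GSF^APF&lt;/code&gt; share nothing.&lt;/p&gt;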
&lt;p&gt;Thirdly, I have only modelled basic syncretism, not endings in wildly different parts of the paradigms (so would definitely not be called syncretism) that also happen to have converged by phonological change. For example both -ου and -ον can be nominal endings or unrelated verbal endings (with quite a few interpretations, mind you, especially for -ου). No attempt has been made to capture this in a single tag (although a disjunctive representation might be possible).&lt;/p&gt;
&lt;p&gt;And finally (although related to the previous point), a certain amount of &lt;em&gt;lexical&lt;/em&gt; disambiguation is applied. There are many cases where not being familiar with the lexeme makes a form highly ambiguous but that ambiguity goes away if the lemma is known. A simple example is imperfects versus second aorists where the principal parts resolve the ambiguity. The draft new tags for MorphGNT SBLGNT effectively assume the lemmatisation has been done and is correct.&lt;/p&gt;
&lt;p&gt;In light of this, some people might be surprised, therefore, that υἱοῦ is tagged &lt;code&gt;GSY&lt;/code&gt; and not &lt;code&gt;GSM&lt;/code&gt; given it&#39;s lexically masculine. My current argument (at least in my own head) is that, regardless of a specific lexeme like υἱοῦ, &lt;code&gt;GSM&lt;/code&gt;, as a morphological tag, doesn&#39;t really make sense in the Greek paradigmatic system because, by nature, genitive singulars have the same form in the non-feminines. I think there&#39;s definitely a difference, if subtle, between true ambiguity and underspecification. It&#39;s not that υἱοῦ is ambiguous as to gender, it&#39;s just that the cell doesn&#39;t distinguish masculine from neuter. Lexical knowledge is still being used, otherwise it could be feminine (or even a middle imperative!).&lt;/p&gt;
&lt;p&gt;So, in short, syncretism inherent to the paradigmatic system is captured well but other forms of ambiguity will need to be handled other ways (potentially via a disjunctive list of possibilities). This seems a reasonable, practical compromise.&lt;/p&gt;
&lt;p&gt;Let me know your thoughts. There&#39;s definitely still more to do and I do plan on expressing more ambiguity with some form of disjunction. I&#39;ll probably do a post soon with some more thoughts (and stats) on that.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I’ve finally done the work in translating the MorphGNT tagging system to a new proposal for initial feedback.</summary>
  </entry><entry>
    <title type="html">Lexical Dispersion in the Greek New Testament Via Gries’s DP</title>
    <link href="https://jktauber.com/2018/01/21/lexical-dispersion-greek-new-testament-gries-dp/" rel="alternate" type="text/html" title="Lexical Dispersion in the Greek New Testament Via Gries’s DP"/>
    <published>2018-01-21</published>
    <updated>2018-01-21</updated>
    <id>https://jktauber.com/2018/01/21/lexical-dispersion-greek-new-testament-gries-dp</id>
    <content type="html" xml:base="https://jktauber.com/2018/01/21/lexical-dispersion-greek-new-testament-gries-dp/">&lt;p&gt;Measures of dispersion are interesting to apply to a corpus because they tell you whether a word is distributed across parts of the corpus as expected or concentrated more in just some parts. I thought I’d play around with Gries’s DP as a measure of dispersion on the SBLGNT lemmas.&lt;/p&gt;
&lt;p&gt;There are lots of measures of dispersion but Stefan Th. Gries&#39;s is perhaps the simplest (see [1] for a detailed survey of lots of different measures as well as the original definition of his own).&lt;/p&gt;
&lt;p&gt;Here it is in Python for lemmas:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;dp = sum(abs((p[part] / t) - (lp[lemma][part] / l[lemma])) for part in p) / 2
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;where:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;p&lt;/code&gt; is a dictionary mapping each corpus part to the count of words in that part&lt;/li&gt;
&lt;li&gt;&lt;code&gt;t&lt;/code&gt; is the total count of words in the corpus (the sum of the values of &lt;code&gt;p&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;l&lt;/code&gt; is a dictionary mapping each lemma to the count of that lemma in the corpus&lt;/li&gt;
&lt;li&gt;&lt;code&gt;lp&lt;/code&gt; is a dictionary of dictionaries mapping lemmas and parts to the count of the lemma in that part (0 when the lemma does not occur in the part)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;but see [1] for some simple worked examples.&lt;/p&gt;
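&lt;p&gt;To make the one-liner easier to try out, here is a minimal, self-contained sketch that builds the dictionaries from a list of hypothetical &lt;code&gt;(part, lemma)&lt;/code&gt; tokens and then applies the same expression. The function name &lt;code&gt;gries_dp&lt;/code&gt; and the toy corpus are illustrative assumptions, not the code used for the numbers in this post:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;from collections import Counter, defaultdict


def gries_dp(tokens):
    # tokens: iterable of (part, lemma) pairs, one per word in the corpus
    p = Counter()              # part to count of words in that part
    l = Counter()              # lemma to count of that lemma in the corpus
    lp = defaultdict(Counter)  # lemma to part to count of the lemma in that part
    for part, lemma in tokens:
        p[part] += 1
        l[lemma] += 1
        lp[lemma][part] += 1
    t = sum(p.values())        # total words in the corpus
    # a Counter gives 0 for parts in which the lemma never occurs
    return {
        lemma: sum(abs((p[part] / t) - (lp[lemma][part] / l[lemma])) for part in p) / 2
        for lemma in l
    }


# toy corpus: two parts of four words each
toy = [
    (&#34;A&#34;, &#34;x&#34;), (&#34;A&#34;, &#34;x&#34;), (&#34;A&#34;, &#34;y&#34;), (&#34;A&#34;, &#34;y&#34;),
    (&#34;B&#34;, &#34;x&#34;), (&#34;B&#34;, &#34;x&#34;), (&#34;B&#34;, &#34;z&#34;), (&#34;B&#34;, &#34;z&#34;),
]
dp = gries_dp(toy)
for lemma in sorted(dp, key=dp.get):
    print(dp[lemma], lemma)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In the toy corpus, &lt;code&gt;x&lt;/code&gt; is spread exactly in proportion to the part sizes and so gets a DP of 0.0, while &lt;code&gt;y&lt;/code&gt; and &lt;code&gt;z&lt;/code&gt; each occur in only one of two equal-sized parts and get 0.5.&lt;/p&gt;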
&lt;p&gt;One thing Gries doesn&#39;t talk about (email me if you know of any discussion of this) is how to handle very low frequency words as they&#39;ll dominate the high DP values.&lt;/p&gt;
&lt;p&gt;Using &lt;strong&gt;books&lt;/strong&gt; as the parts, here are the top 10 most evenly dispersed lemmas in the GNT:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;0.0466 ὁ
0.1085 εἰς
0.1154 καί
0.1178 ὅς
0.1250 εἰμί
0.1358 ποιέω
0.1382 γίνομαι
0.1385 πολύς
0.1395 μετά
0.1420 μή
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Here are the top 10 least evenly dispersed lemmas (including all frequencies, even hapax legomena):&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;0.9984 φιλοπρωτεύω
0.9984 ἐπιδέχομαι
0.9984 μειζότερος
0.9984 Διοτρέφης
0.9984 φλυαρέω
0.9982 χάρτης
0.9982 κυρία
0.9976 προσοφείλω
0.9976 ἑκούσιος
0.9976 ἄχρηστος
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;but this list looks very different if we, say, restrict ourselves to lemmas that occur 5 times or more:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;0.9827 ἀντίχριστος
0.9752 καταλαλέω
0.9687 ἐπιφάνεια
0.9681 νήφω
0.9680 ἀρετή
0.9667 μῦθος
0.9641 Μελχισέδεκ
0.9568 πλεονεκτέω
0.9557 νόημα
0.9532 ἐνέργεια
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;or 30 times or more:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;0.8952 ἀρνίον
0.8085 καυχάομαι
0.8024 θηρίον
0.7987 μέλος
0.7969 εἴτε
0.7266 συνείδησις
0.7202 περιτομή
0.7199 θρόνος
0.7139 ὑποτάσσω
0.7116 Παῦλος
&lt;/code&gt;&lt;/pre&gt;
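&lt;p&gt;The frequency cut-offs used for these lists amount to a simple filter before sorting. A sketch of that step, assuming a hypothetical &lt;code&gt;dp&lt;/code&gt; dictionary of DP values and a &lt;code&gt;counts&lt;/code&gt; dictionary of lemma frequencies (the values below are placeholders, not real data):&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;def least_dispersed(dp, counts, min_count=1, n=10):
    # dp: lemma to DP value; counts: lemma to corpus frequency
    # keep only lemmas that meet the frequency threshold, highest DP first
    eligible = [lemma for lemma in dp if counts[lemma] &amp;gt;= min_count]
    return sorted(eligible, key=lambda lemma: dp[lemma], reverse=True)[:n]


# placeholder values for illustration only
dp = {&#34;a&#34;: 0.99, &#34;b&#34;: 0.90, &#34;c&#34;: 0.50}
counts = {&#34;a&#34;: 1, &#34;b&#34;: 5, &#34;c&#34;: 30}

print(least_dispersed(dp, counts))
print(least_dispersed(dp, counts, min_count=5))
print(least_dispersed(dp, counts, min_count=30))
&lt;/code&gt;&lt;/pre&gt;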

&lt;p&gt;If we use &lt;strong&gt;chapters&lt;/strong&gt; as the corpus division, we get a slightly different top ten of the most evenly distributed lemmas by Gries&#39;s DP:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;0.0677 ὁ
0.1440 καί
0.1913 εἰμί
0.2084 εἰς
0.2117 αὐτός
0.2259 ἐν
0.2366 οὗτος
0.2378 ὅς
0.2437 δέ
0.2561 οὐ
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;and obviously this is even more problematic for lower frequency words at the other end.&lt;/p&gt;
&lt;p&gt;It&#39;s interesting to look, though, at chapters within a single book. For example, here are the most evenly distributed lemmas in John&#39;s gospel using chapters for parts:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;0.0574 ὁ
0.0867 καί
0.0977 αὐτός
0.1331 οὐ
0.1391 οὗτος
0.1440 ὅτι
0.1480 λέγω
0.1569 δέ
0.1576 εἰμί
0.1658 εἰς
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;and here are the least evenly distributed lemmas that occur at least 10 times:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;0.9470 σταυρόω
0.9414 Ἀβραάμ
0.9126 νίπτω
0.8958 Πιλᾶτος
0.8914 πρόβατον
0.8812 Λάζαρος
0.8493 καρπός
0.8426 ἄρτος
0.8371 προσκυνέω
0.8221 ψυχή
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Obviously Gries&#39;s DP is extremely easy to calculate, and I plan to experimentally include it in the Greek Vocabulary Tool for the Perseus Project but there are still some things to work out with low frequency words.&lt;/p&gt;
&lt;p&gt;It&#39;s very interesting, though, as a way of contrasting words that otherwise have the same frequency in a corpus. For example, here are all the lemmas that occur &lt;em&gt;exactly&lt;/em&gt; 30 times in the SBLGNT, with their book-based Gries&#39;s DP:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;0.3276 διδαχή
0.3558 ἐγγύς
0.3708 σκότος
0.4143 ἀγοράζω
0.5360 σκανδαλίζω
0.5833 συνέρχομαι
0.6230 ἴδε
0.6485 ἐπικαλέω
0.7266 συνείδησις
0.8952 ἀρνίον
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;There is a massive range in the DP which I think is quite illustrative.&lt;/p&gt;
&lt;p&gt;Here is the list with their chapter-based DP (notice how high the lowest DP now is):&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;0.8769 ἀγοράζω
0.8821 σκότος
0.8869 συνέρχομαι
0.8958 σκανδαλίζω
0.9016 ἐγγύς
0.9016 διδαχή
0.9034 ἴδε
0.9083 ἐπικαλέω
0.9441 συνείδησις
0.9609 ἀρνίον
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;One of my reasons for exploring Gries&#39;s DP (and potentially other measures of lexical dispersion) is the application to language learning. My sense is that dispersion might be a useful input to deciding what vocabulary to learn. For example διδαχή or σκότος might be better to learn before ἀρνίον because, even though they all have the same frequency, you are more likely to encounter διδαχή or σκότος in a random book or chapter.&lt;/p&gt;
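&lt;p&gt;As one possible (and untested) way of using dispersion as such an input, vocabulary could be ordered by frequency first and then by DP as a tie-breaker. This sketch uses three of the 30-occurrence lemmas and their book-based DP values from the list above; the ordering rule itself is just an assumption for illustration:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;# (lemma, frequency in the SBLGNT, book-based DP) taken from the list above
vocab = [
    (&#34;ἀρνίον&#34;, 30, 0.8952),
    (&#34;σκότος&#34;, 30, 0.3708),
    (&#34;διδαχή&#34;, 30, 0.3276),
]

# most frequent first; among equally frequent lemmas, most evenly dispersed first
ordered = sorted(vocab, key=lambda item: (-item[1], item[2]))

for lemma, freq, dp in ordered:
    print(lemma, freq, dp)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;With these values διδαχή comes first, then σκότος, then ἀρνίον.&lt;/p&gt;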
&lt;p&gt;[1] Gries, Stefan Th. (2008) &lt;a href=&#34;http://www.linguistics.ucsb.edu/faculty/stgries/research/2008_STG_Dispersion_IJCL.pdf&#34;&gt;Dispersions and adjusted frequencies in corpora&lt;/a&gt;. International Journal of Corpus Linguistics 13:4. John Benjamins.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Measures of dispersion are interesting to apply to a corpus because they tell you whether a word is distributed across parts of the corpus as expected or concentrated more in just some parts. I thought I’d play around with Gries’s DP as a measure of dispersion on the SBLGNT lemmas.</summary>
  </entry><entry>
    <title type="html">Some Unix Command Line Exercises Using MorphGNT</title>
    <link href="https://jktauber.com/2017/12/24/some-unix-command-line-exercises-using-morphgnt/" rel="alternate" type="text/html" title="Some Unix Command Line Exercises Using MorphGNT"/>
    <published>2017-12-24</published>
    <updated>2017-12-24</updated>
    <id>https://jktauber.com/2017/12/24/some-unix-command-line-exercises-using-morphgnt</id>
    <content type="html" xml:base="https://jktauber.com/2017/12/24/some-unix-command-line-exercises-using-morphgnt/">&lt;p&gt;I thought I’d help a friend learn some basic Unix command line (although pretty comprehensive for this type of work) with some practical graded exercises using MorphGNT. It worked out well so I thought I’d share in case they are useful to others.&lt;/p&gt;
&lt;p&gt;The point here is not to actually teach how to use &lt;code&gt;bash&lt;/code&gt; or commands like &lt;code&gt;grep&lt;/code&gt;, &lt;code&gt;awk&lt;/code&gt;, &lt;code&gt;cut&lt;/code&gt;, &lt;code&gt;sort&lt;/code&gt;, &lt;code&gt;uniq&lt;/code&gt;, &lt;code&gt;head&lt;/code&gt; or &lt;code&gt;wc&lt;/code&gt; but rather to motivate their use in a gradual fashion with real use cases and to structure what to actually look up when learning how to use them.&lt;/p&gt;
&lt;p&gt;This little set of commands has served me well for over twenty years working with MorphGNT in its various iterations (although I obviously switch to Python for anything more complex).&lt;/p&gt;
&lt;h3&gt;Task 0&lt;/h3&gt;
&lt;p&gt;Clone &lt;a href=&#34;https://github.com/morphgnt/sblgnt&#34;&gt;https://github.com/morphgnt/sblgnt&lt;/a&gt; in git.&lt;/p&gt;
&lt;h3&gt;Task 1&lt;/h3&gt;
&lt;p&gt;Using &lt;code&gt;wc&lt;/code&gt; and the concept of wildcards/globbing (and relying on the fact I have one line-per-word in those files) work out how many words are in the main text of SBLGNT.&lt;/p&gt;
&lt;h3&gt;Task 2&lt;/h3&gt;
&lt;p&gt;Using &lt;code&gt;grep&lt;/code&gt; and &lt;code&gt;wc&lt;/code&gt; work out how many times μονογενής appears. (You might be able to do it with just &lt;code&gt;grep&lt;/code&gt; and appropriate options, but try using &lt;code&gt;grep&lt;/code&gt; without options and &lt;code&gt;wc&lt;/code&gt; and understand the concept of &#34;piping&#34; the output of one command to the input of another)&lt;/p&gt;
&lt;h3&gt;Task 3&lt;/h3&gt;
&lt;p&gt;How many verbs (tokens) are there in John’s gospel? (still doable just with &lt;code&gt;grep&lt;/code&gt; and &lt;code&gt;wc&lt;/code&gt;)&lt;/p&gt;
&lt;h3&gt;Task 4&lt;/h3&gt;
&lt;p&gt;How many &lt;em&gt;unique&lt;/em&gt; verbs (lemmas) are there in John’s gospel?&lt;/p&gt;
&lt;p&gt;(learn how to use &lt;code&gt;awk&lt;/code&gt; to extract fields, and how to use &lt;code&gt;sort&lt;/code&gt; and &lt;code&gt;uniq&lt;/code&gt; in tandem)&lt;/p&gt;
&lt;h3&gt;Task 5&lt;/h3&gt;
&lt;p&gt;What are the 5 most common verbs (lemmas) in John’s gospel? (you might want to use &lt;code&gt;head&lt;/code&gt;)&lt;/p&gt;
&lt;h3&gt;Task 6&lt;/h3&gt;
&lt;p&gt;Get counts in John’s Gospel of how many tokens appear in each tense/aspect (hint: use &lt;code&gt;cut&lt;/code&gt;) and write the results to a file called &lt;code&gt;john.txt&lt;/code&gt; rather than just output it in the terminal.&lt;/p&gt;
&lt;h3&gt;Task 7&lt;/h3&gt;
&lt;p&gt;Come up with your own question that you think could be answered using these types of operations and try it out.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I thought I’d help a friend learn some basic Unix command line (although pretty comprehensive for this type of work) with some practical graded exercises using MorphGNT. It worked out well so I thought I’d share in case they are useful to others.</summary>
  </entry><entry>
    <title type="html">SBL Papers Now Online</title>
    <link href="https://jktauber.com/2017/11/22/sbl-papers-now-online/" rel="alternate" type="text/html" title="SBL Papers Now Online"/>
    <published>2017-11-22</published>
    <updated>2017-11-22</updated>
    <id>https://jktauber.com/2017/11/22/sbl-papers-now-online</id>
    <content type="html" xml:base="https://jktauber.com/2017/11/22/sbl-papers-now-online/">&lt;p&gt;I’ve put my two SBL papers this year (from both the recent Annual Meeting and the International Meeting) online and also sync’d my Annual Meeting slides to audio I recorded on my iPhone.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;SBL 2017 Annual&lt;/em&gt;: &lt;strong&gt;Linking Lexical Resources for Biblical Greek&lt;/strong&gt; &lt;br&gt;&lt;a href=&#34;https://www.academia.edu/35220175/Linking_Lexical_Resources_for_Biblical_Greek&#34;&gt;[slides]&lt;/a&gt; &lt;a href=&#34;https://vimeo.com/243936959&#34;&gt;[video]&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;SBL 2017 International&lt;/em&gt;: &lt;strong&gt;The Route to Adaptive Learning of Greek&lt;/strong&gt; &lt;br&gt;&lt;a href=&#34;https://www.academia.edu/35220134/The_Route_to_Adaptive_Learning_of_Greek&#34;&gt;[slides]&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For completeness, here are my other SBL talks:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;SBL 2016 Annual&lt;/em&gt;: &lt;strong&gt;An Online Adaptive Reading Environment for the Greek New Testament&lt;/strong&gt; &lt;br&gt;&lt;a href=&#34;https://www.academia.edu/30722025/An_Online_Adaptive_Reading_Environment_for_the_Greek_New_Testament&#34;&gt;[slides]&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;SBL 2015 Annual&lt;/em&gt;: &lt;strong&gt;A Morphological Lexicon of New Testament Greek&lt;/strong&gt; &lt;br&gt;&lt;a href=&#34;https://www.academia.edu/18816954/A_Morphological_Lexicon_of_New_Testament_Greek&#34;&gt;[slides]&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I’ve put my two SBL papers this year (from both the recent Annual Meeting and the International Meeting) online and also sync’d my Annual Meeting slides to audio I recorded on my iPhone.</summary>
  </entry><entry>
    <title type="html">Speaking at SBL 2017 on Linking Lexical Resources</title>
    <link href="https://jktauber.com/2017/11/18/speaking-sbl-2017-linking-lexical-resources/" rel="alternate" type="text/html" title="Speaking at SBL 2017 on Linking Lexical Resources"/>
    <published>2017-11-18</published>
    <updated>2017-11-18</updated>
    <id>https://jktauber.com/2017/11/18/speaking-sbl-2017-linking-lexical-resources</id>
    <content type="html" xml:base="https://jktauber.com/2017/11/18/speaking-sbl-2017-linking-lexical-resources/">&lt;p&gt;I’m again speaking at the SBL Annual Meeting, this time in Boston. My topic is basically the “lemma lattice” work started by Ulrik Sandborg-Petersen and me back in 2006 but which I’ve never presented in this sort of setting before.&lt;/p&gt;
&lt;p&gt;Here&#39;s the official abstract:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Linking Lexical Resources for Biblical Greek&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;As more resources for Biblical Greek, both old and new, become openly available, the opportunities for integrating them become greater. At the level of the word, it might seem a trivial task to match based on lemma. But no two texts are lemmatised the same way and no two lexicons will make the same choices of headwords. Numerical solutions such as Strongs and Goodrick-Kohlenberger solve some problems but introduce new ones. After surveying the various issues and challenges, this talk will provide both a framework for moving forward and a report on practical ways that a variety of texts, lexicons, and other resources such as principal-part lists are being linked in the service of open, biblical digital humanities.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I&#39;ll certainly post my slides after my talk but I&#39;ll also try to record it on my iPhone like I did at BibleTech 2015.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I’m again speaking at the SBL Annual Meeting, this time in Boston. My topic is basically the “lemma lattice” work started by Ulrik Sandborg-Petersen and me back in 2006 but which I’ve never presented in this sort of setting before.</summary>
  </entry><entry>
    <title type="html">Four Types of But</title>
    <link href="https://jktauber.com/2017/11/03/four-types-but/" rel="alternate" type="text/html" title="Four Types of But"/>
    <published>2017-11-03</published>
    <updated>2017-11-03</updated>
    <id>https://jktauber.com/2017/11/03/four-types-but</id>
    <content type="html" xml:base="https://jktauber.com/2017/11/03/four-types-but/">&lt;p&gt;In his talk on adversative conjunction in Gothic at the 29th UCLA Indo-European Conference, Jared Klein started with a wonderful example paragraph in English.&lt;/p&gt;
&lt;blockquote&gt;
In order to finish the project, I don&#39;t need money &lt;i&gt;but&lt;/i&gt;&lt;sub&gt;2&lt;/sub&gt; time. I would like to be done by the end of this year, &lt;i&gt;but&lt;/i&gt;&lt;sub&gt;3&lt;/sub&gt; I don&#39;t think that is going to happen. Nobody is to blame for this &lt;i&gt;but&lt;/i&gt;&lt;sub&gt;1&lt;/sub&gt; me, because I&#39;ve wasted a lot of time on things that have proved to be irrelevant. &lt;i&gt;But&lt;/i&gt;&lt;sub&gt;4&lt;/sub&gt; this is too depressing; let&#39;s talk about something else.
&lt;/blockquote&gt;

&lt;p&gt;He went on to talk about the Gothic equivalents for each but I thought it was a great illustration of four distinct types of adversatives all using &#34;but&#34; in English.&lt;/p&gt;
&lt;p&gt;Klein didn&#39;t necessarily use the following terms but the four could be described as:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;prepositional&lt;/li&gt;
&lt;li&gt;phrasal&lt;/li&gt;
&lt;li&gt;clausal&lt;/li&gt;
&lt;li&gt;discourse&lt;/li&gt;
&lt;/ol&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">In his talk on adversative conjunction in Gothic at the 29th UCLA Indo-European Conference, Jared Klein started with a wonderful example paragraph in English.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 19</title>
    <link href="https://jktauber.com/2017/11/02/tour-greek-morphology-part-19/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 19"/>
    <published>2017-11-02</published>
    <updated>2017-11-02</updated>
    <id>https://jktauber.com/2017/11/02/tour-greek-morphology-part-19</id>
    <content type="html" xml:base="https://jktauber.com/2017/11/02/tour-greek-morphology-part-19/">&lt;p&gt;Part nineteen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;It&#39;s now time to do for the middle forms what we did for the actives in &lt;a href=&#34;/2017/09/07/tour-greek-morphology-part-16/&#34;&gt;part 16&lt;/a&gt;, namely come up with the rules to help disambiguate inflectional classes. These were sketched out in theory in &lt;a href=&#34;/2017/08/29/tour-greek-morphology-part-14/&#34;&gt;part 14&lt;/a&gt; but now it&#39;s time to actually write the rules and test them in code against the SBLGNT.&lt;/p&gt;
&lt;p&gt;This is what my Python script does:&lt;/p&gt;
&lt;table&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;INF&lt;/b&gt;:Xεσθαι or
      &lt;b&gt;3SG&lt;/b&gt;:Xεται or
      &lt;b&gt;2PL&lt;/b&gt;:Xεσθε
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;PM-1&lt;/b&gt; if lemma ends in ω or ομαι&lt;br&gt;
      &lt;b&gt;PM-7&lt;/b&gt; if lemma ends in ημι
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;1SG&lt;/b&gt;:Xομαι or
      &lt;b&gt;1PL&lt;/b&gt;:Xόμεθα or
      &lt;b&gt;3PL&lt;/b&gt;:Xονται
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;PM-8&lt;/b&gt; if lemma ends in δίδομαι&lt;br&gt;
      &lt;b&gt;PM-1&lt;/b&gt; if lemma ends in ω or otherwise ends in ομαι
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;1SG&lt;/b&gt;:Xοῦμαι or
      &lt;b&gt;3PL&lt;/b&gt;:Xοῦνται
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;PM-2&lt;/b&gt; if lemma ends in έω or έομαι&lt;br&gt;
      &lt;b&gt;PM-3&lt;/b&gt; if lemma ends in όω or όομαι
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;1SG&lt;/b&gt;:Xῶμαι or
      &lt;b&gt;1PL&lt;/b&gt;:Xώμεθα or
      &lt;b&gt;3PL&lt;/b&gt;:Xῶνται
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;PM-5&lt;/b&gt; if lemma ends in χράομαι&lt;br&gt;
      &lt;b&gt;PM-4&lt;/b&gt; if lemma otherwise ends in άομαι
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;2SG&lt;/b&gt;:Xῇ
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;PM-2&lt;/b&gt; if lemma ends in έω or έομαι&lt;br&gt;
      &lt;b&gt;PM-5&lt;/b&gt; if lemma ends in άομαι
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;1PL&lt;/b&gt;:Xύμεθα
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;PM-2&lt;/b&gt; if lemma ends in έω or έομαι&lt;br&gt;
      &lt;b&gt;PM-3&lt;/b&gt; if lemma ends in όω or όομαι (not needed in SBLGNT)&lt;br&gt;
      &lt;b&gt;PM-5&lt;/b&gt; otherwise (not needed in SBLGNT)
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;3SG&lt;/b&gt;:Xεῖται or
      &lt;b&gt;2PL&lt;/b&gt;:Xεῖσθε
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;PM-2&lt;/b&gt; if lemma ends in έω or έομαι&lt;br&gt;
      &lt;b&gt;PM-11&lt;/b&gt; if lemma ends in εῖμαι
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;1PL&lt;/b&gt;:Xείμεθα
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;PM-11&lt;/b&gt; if lemma is κεῖμαι&lt;br&gt;
      &lt;b&gt;PM-11-COMPOUND&lt;/b&gt; otherwise (not needed in SBLGNT)
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;INF&lt;/b&gt;:Xεῖσθαι
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;PM-2&lt;/b&gt; if lemma ends in έω or έομαι&lt;br&gt;
      &lt;b&gt;PM-11&lt;/b&gt; if lemma is κεῖμαι (not needed in SBLGNT)&lt;br&gt;
      &lt;b&gt;PM-11-COMPOUND&lt;/b&gt; otherwise
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;INF&lt;/b&gt;:Xῆσθαι
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;PM-10-COMPOUND&lt;/b&gt; if lemma is κάθημαι&lt;br&gt;
      &lt;b&gt;PM-5&lt;/b&gt; otherwise (not needed in SBLGNT)
    &lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;
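&lt;p&gt;To give a feel for how rules like these translate into code, here is a minimal sketch covering just two rows of the table. It is not the script itself: the function names, the &lt;code&gt;RULES&lt;/code&gt; structure and the example inputs are all assumptions made for illustration. Each rule pairs a set of (cell, form ending) patterns with an ordered list of lemma endings, and the first lemma ending that matches wins, which is how χράομαι gets caught before the more general άομαι.&lt;/p&gt;

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so tonos/oxia variants of accented vowels compare equal."""
    return unicodedata.normalize("NFC", text)


# Two rows of the table, as (cell, form ending) patterns mapped to an ordered
# list of (lemma ending, class). The layout is an assumption for illustration.
RULES = [
    (
        [("1SG", "ῶμαι"), ("1PL", "ώμεθα"), ("3PL", "ῶνται")],
        [("χράομαι", "PM-5"), ("άομαι", "PM-4")],
    ),
    (
        [("3SG", "εῖται"), ("2PL", "εῖσθε")],
        [("έω", "PM-2"), ("έομαι", "PM-2"), ("εῖμαι", "PM-11")],
    ),
]


def disambiguate(cell, form, lemma):
    """Return the inflectional class for a form, or None if no rule applies."""
    form, lemma = nfc(form), nfc(lemma)
    for patterns, lemma_rules in RULES:
        for rule_cell, form_ending in patterns:
            if cell == rule_cell and form.endswith(nfc(form_ending)):
                # Lemma endings are checked in order; the first match wins.
                for lemma_ending, inflectional_class in lemma_rules:
                    if lemma.endswith(nfc(lemma_ending)):
                        return inflectional_class
    return None


# Illustrative inputs only.
print(disambiguate("1SG", "χρῶμαι", "χράομαι"))
print(disambiguate("3PL", "καυχῶνται", "καυχάομαι"))
print(disambiguate("3SG", "κεῖται", "κεῖμαι"))
```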

&lt;p&gt;I decided to cover a number of ambiguities that don&#39;t actually arise in the SBLGNT. This isn&#39;t strictly necessary but it will help when the script is extended to run on a larger corpus.&lt;/p&gt;
&lt;p&gt;Note the special-casing of δίδομαι, κεῖμαι, κάθημαι, and χράομαι. χράομαι is an example, like ζάω in &lt;a href=&#34;/2017/09/07/tour-greek-morphology-part-16/&#34;&gt;part 16&lt;/a&gt;, that is misleadingly lemmatized with an alpha. More on that later!&lt;/p&gt;
&lt;p&gt;We now have an inflectional class for all 820 present middle infinitive or indicative forms in the MorphGNT SBLGNT.&lt;/p&gt;
&lt;p&gt;You can download the entire output of my Python script &lt;a href=&#34;https://gist.github.com/jtauber/accb8180f56fceee37f57a040faa4b8a&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Are there multiple classes for a particular lexeme (as there were in the active)?&lt;/p&gt;
&lt;p&gt;Two of the 167 lexemes show multiple classes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;δύναμαι: &lt;strong&gt;PM-9&lt;/strong&gt; normally but a &lt;strong&gt;2SG&lt;/strong&gt;:δύνῃ that comes up as a &lt;strong&gt;PM-1&lt;/strong&gt; (&lt;strong&gt;PM-9&lt;/strong&gt; would predict a Xασαι)&lt;/li&gt;
&lt;li&gt;κάθημαι: &lt;strong&gt;PM-10-COMPOUND&lt;/strong&gt; normally but a &lt;strong&gt;2SG&lt;/strong&gt;:κάθῃ that comes up as a &lt;strong&gt;PM-1&lt;/strong&gt; (&lt;strong&gt;PM-10-COMPOUND&lt;/strong&gt; would predict a Xησαι)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If κάθῃ were καθῇ, we&#39;d have the possibility of reanalysis as a &lt;strong&gt;PM-5&lt;/strong&gt;. It&#39;s still possible that&#39;s what&#39;s going on and the accentuation just doesn&#39;t reflect it.&lt;/p&gt;
&lt;p&gt;δύνῃ for δύνασαι is somewhat less expected and it should be noted that both forms appear in the SBLGNT, sometimes within the same author. That the &lt;strong&gt;PM-4&lt;/strong&gt; &lt;strong&gt;2SG&lt;/strong&gt; all show up with an un-contracted ᾶσαι adds slightly more mystery.&lt;/p&gt;
&lt;p&gt;For now we&#39;ll leave δύνῃ and κάθῃ as &lt;strong&gt;PM-1&lt;/strong&gt; but we&#39;ll revisit them later.&lt;/p&gt;
&lt;p&gt;In the next part, we&#39;ll look at counts for the present middles across the SBLGNT.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part nineteen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">Off to the UCLA Indo-European Conference</title>
    <link href="https://jktauber.com/2017/11/01/ucla-indo-european-conference/" rel="alternate" type="text/html" title="Off to the UCLA Indo-European Conference"/>
    <published>2017-11-01</published>
    <updated>2017-11-01</updated>
    <id>https://jktauber.com/2017/11/01/ucla-indo-european-conference</id>
    <content type="html" xml:base="https://jktauber.com/2017/11/01/ucla-indo-european-conference/">&lt;p&gt;Tomorrow I’m off to Los Angeles for the &lt;a href=&#34;http://www.pies.ucla.edu/IECprogram.html&#34;&gt;Twenty-Ninth Annual UCLA Indo-European Conference&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Indo-European studies are notoriously impenetrable, even for linguists, but a couple of months ago, I finally decided now was the time to attend this major conference (to the extent an IE conference &lt;em&gt;can&lt;/em&gt; be &#34;major&#34;).&lt;/p&gt;
&lt;p&gt;I&#39;m not great at conferences at the best of times, especially when I&#39;m not a speaker and/or don&#39;t know very many people, so this will be quite a stepping-out-of-the-comfort-zone for me.&lt;/p&gt;
&lt;p&gt;But as an aspiring comparative philologist, I&#39;m sure it&#39;s going to be very rewarding for me.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Tomorrow I’m off to Los Angeles for the &lt;a href=&#34;http://www.pies.ucla.edu/IECprogram.html&#34;&gt;Twenty-Ninth Annual UCLA Indo-European Conference&lt;/a&gt;.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 18</title>
    <link href="https://jktauber.com/2017/10/27/tour-greek-morphology-part-18/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 18"/>
    <published>2017-10-27</published>
    <updated>2017-10-27</updated>
    <id>https://jktauber.com/2017/10/27/tour-greek-morphology-part-18</id>
    <content type="html" xml:base="https://jktauber.com/2017/10/27/tour-greek-morphology-part-18/">&lt;p&gt;Part eighteen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;In &lt;a href=&#34;/2017/08/26/tour-greek-morphology-part-13/&#34;&gt;Part 13&lt;/a&gt; we summarised the present active endings and in &lt;a href=&#34;/2017/09/05/tour-greek-morphology-part-15/&#34;&gt;part 15&lt;/a&gt; posed the question &#34;Do these paradigms cover all the forms in the Greek New Testament?&#34;&lt;/p&gt;
&lt;p&gt;Now we&#39;re going to answer the same question for the middle endings summarised in &lt;a href=&#34;/2017/08/29/tour-greek-morphology-part-14/&#34;&gt;part 14&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Again, I&#39;ve written a short Python program that reveals there are 16 forms in 23 instances that do NOT match.&lt;/p&gt;
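&lt;p&gt;In outline, a check like this just asks whether each form ends in one of the endings known for its paradigm cell. The following is a minimal sketch, not the program itself: it only knows the &lt;strong&gt;PM-1&lt;/strong&gt; endings, whereas a real check would include every class, and the names &lt;code&gt;ENDINGS&lt;/code&gt; and &lt;code&gt;unmatched&lt;/code&gt; are assumptions made for illustration.&lt;/p&gt;

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so equivalent accented characters compare equal."""
    return unicodedata.normalize("NFC", text)


# PM-1 endings only; a real check would list the endings of every class.
ENDINGS = {
    "INF": ["εσθαι"],
    "1SG": ["ομαι"],
    "2SG": ["ῃ", "ει"],
    "3SG": ["εται"],
    "1PL": ["όμεθα"],
    "2PL": ["εσθε"],
    "3PL": ["ονται"],
}


def unmatched(forms):
    """Return the (cell, form) pairs whose form matches no known ending."""
    misses = []
    for cell, form in forms:
        endings = tuple(nfc(e) for e in ENDINGS.get(cell, []))
        # str.endswith accepts a tuple and is False for an empty one.
        if not nfc(form).endswith(endings):
            misses.append((cell, form))
    return misses


FORMS = [
    ("1SG", "ἔρχομαι"),
    ("3SG", "ἔρχεται"),
    ("1SG", "κάθημαι"),
    ("3SG", "κάθηται"),
]
print(unmatched(FORMS))
```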
&lt;p&gt;Two of these forms are of κάθημαι: the &lt;strong&gt;1SG&lt;/strong&gt; itself plus the &lt;strong&gt;3SG&lt;/strong&gt; κάθηται. The &lt;strong&gt;3SG&lt;/strong&gt; bears a resemblance to the &lt;strong&gt;PM-5&lt;/strong&gt; &lt;strong&gt;3SG&lt;/strong&gt; (differing only in accent) but this is not a circumflex verb. The existence of the η in the &lt;strong&gt;1SG&lt;/strong&gt; rather than an ῶ indicates this is an athematic verb. It is in fact a compound verb κατά+ἧμαι.&lt;/p&gt;
&lt;p&gt;We don&#39;t have a paradigm class for ἧμαι OR its compounds so let&#39;s add them now.&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;th&gt;&amp;nbsp;&lt;/th&gt;  &lt;th&gt;PM-10&lt;/th&gt;           &lt;th&gt;PM-10-COMPOUND&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;INF&lt;/th&gt;     &lt;td&gt;&lt;i&gt;ἧσθαι&lt;/i&gt;&lt;/td&gt;    &lt;td&gt;&lt;i&gt;Xῆσθαι&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;1SG&lt;/th&gt;     &lt;td&gt;&lt;i&gt;ἧμαι&lt;/i&gt;&lt;/td&gt;     &lt;td&gt;Xημαι&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;2SG&lt;/th&gt;     &lt;td&gt;&lt;i&gt;ἧσαι&lt;/i&gt;&lt;/td&gt;     &lt;td&gt;&lt;i&gt;Xησαι&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;3SG&lt;/th&gt;     &lt;td&gt;&lt;i&gt;ἧται&lt;/i&gt;&lt;/td&gt;     &lt;td&gt;Xηται&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;1PL&lt;/th&gt;     &lt;td&gt;&lt;i&gt;ἥμεθα&lt;/i&gt;&lt;/td&gt;    &lt;td&gt;&lt;i&gt;Xήμεθα&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;2PL&lt;/th&gt;     &lt;td&gt;&lt;i&gt;ἧσθε&lt;/i&gt;&lt;/td&gt;     &lt;td&gt;&lt;i&gt;Xησθε&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;3PL&lt;/th&gt;     &lt;td&gt;&lt;i&gt;ἧνται&lt;/i&gt;&lt;/td&gt;    &lt;td&gt;&lt;i&gt;Xηνται&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;(we don&#39;t actually need &lt;strong&gt;PM-10&lt;/strong&gt; for the SBLGNT but I&#39;ve included it for completeness)&lt;/p&gt;
&lt;p&gt;Next we have κεῖμαι and ITS compounds which account for 10 more forms. Here again we have an athematic verb with a vowel we haven&#39;t covered before.&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;th&gt;&amp;nbsp;&lt;/th&gt;  &lt;th&gt;PM-11&lt;/th&gt;           &lt;th&gt;PM-11-COMPOUND&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;INF&lt;/th&gt;     &lt;td&gt;&lt;i&gt;Xεῖσθαι&lt;/i&gt;&lt;/td&gt;  &lt;td&gt;&lt;i&gt;Xεῖσθαι&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;1SG&lt;/th&gt;     &lt;td&gt;Xεῖμαι&lt;/td&gt;          &lt;td&gt;&lt;i&gt;Xειμαι&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;2SG&lt;/th&gt;     &lt;td&gt;&lt;i&gt;Xεῖσαι&lt;/i&gt;&lt;/td&gt;   &lt;td&gt;&lt;i&gt;Xεισαι&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;3SG&lt;/th&gt;     &lt;td&gt;&lt;i&gt;Xεῖται&lt;/i&gt;&lt;/td&gt;   &lt;td&gt;Xειται&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;1PL&lt;/th&gt;     &lt;td&gt;Xείμεθα&lt;/td&gt;         &lt;td&gt;&lt;i&gt;Xείμεθα&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;2PL&lt;/th&gt;     &lt;td&gt;&lt;i&gt;Xεῖσθε&lt;/i&gt;&lt;/td&gt;   &lt;td&gt;&lt;i&gt;Xεισθε&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;3PL&lt;/th&gt;     &lt;td&gt;&lt;i&gt;Xεῖνται&lt;/i&gt;&lt;/td&gt;  &lt;td&gt;Xεινται&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;Note that &lt;strong&gt;INF&lt;/strong&gt; and &lt;strong&gt;1PL&lt;/strong&gt; are identical between the two of them (so this will be an ambiguity we&#39;ll need to cover, although not for the SBLGNT).&lt;/p&gt;
&lt;p&gt;Our next word is οἶμαι which only appears in the SBLGNT in the &lt;strong&gt;1SG&lt;/strong&gt;. We won&#39;t reconstruct the entire paradigm (we may come back to it later) but will use &lt;strong&gt;PM-12&lt;/strong&gt; to designate the οἶμαι form.&lt;/p&gt;
&lt;p&gt;This leaves us with three forms, all &lt;strong&gt;2SG&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;καυχᾶσαι&lt;/li&gt;
&lt;li&gt;ὀδυνᾶσαι&lt;/li&gt;
&lt;li&gt;κατακαυχᾶσαι&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In all cases, this looks a lot like a &lt;strong&gt;PM-4&lt;/strong&gt; that just hasn&#39;t dropped the sigma in -ᾶσαι to form -ᾷ. In fact, all the &lt;strong&gt;PM-4&lt;/strong&gt;s in the SBLGNT seem to have this behaviour so we probably shouldn&#39;t treat it as a separate paradigm but rather an alternative realisation within the &lt;strong&gt;PM-4&lt;/strong&gt; &lt;strong&gt;2SG&lt;/strong&gt; cell (similar to Xῃ/Xει in the &lt;strong&gt;PM-1&lt;/strong&gt;). We&#39;ll discuss in a later post why &lt;strong&gt;PM-4&lt;/strong&gt; might exhibit this when other circumflex middle paradigms don&#39;t seem to.&lt;/p&gt;
&lt;p&gt;But with this tweak and the additions of &lt;strong&gt;PM-10&lt;/strong&gt;, &lt;strong&gt;PM-10-COMPOUND&lt;/strong&gt;, &lt;strong&gt;PM-11&lt;/strong&gt;, &lt;strong&gt;PM-11-COMPOUND&lt;/strong&gt;, and &lt;strong&gt;PM-12&lt;/strong&gt; we now have full coverage of the present middle indicatives and infinitives in the SBLGNT.&lt;/p&gt;
&lt;p&gt;You may be wondering whether we could have just identified these paradigms way back when we first laid out the different present middle paradigms. We absolutely could have. But I think the way we&#39;ve discovered them demonstrates an important concept: that of rigorously testing a linguistic model against a corpus.&lt;/p&gt;
&lt;p&gt;This whole blog series is, in fact, laying the groundwork for a rigorous description of Greek morphology that has been my goal to write for many years.&lt;/p&gt;
&lt;p&gt;But coming back to the short term: we still have to work out how to disambiguate the inflectional classes of the middle forms, like we did for the actives in &lt;a href=&#34;/2017/09/07/tour-greek-morphology-part-16/&#34;&gt;part 16&lt;/a&gt;. We&#39;ll do that in the next part.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part eighteen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 17</title>
    <link href="https://jktauber.com/2017/10/16/tour-greek-morphology-part-17/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 17"/>
    <published>2017-10-16</published>
    <updated>2017-10-16</updated>
    <id>https://jktauber.com/2017/10/16/tour-greek-morphology-part-17</id>
    <content type="html" xml:base="https://jktauber.com/2017/10/16/tour-greek-morphology-part-17/">&lt;p&gt;Part seventeen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;As mentioned in the &lt;a href=&#34;/2017/09/07/tour-greek-morphology-part-16/&#34;&gt;last post&lt;/a&gt; in the series, we now have an inflectional class for all 5,314 present active infinitive or indicative forms in the MorphGNT SBLGNT in a file that looks like the following:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;010120 ἐστί(ν) 3SG PA-10 εἰμί PA-10
010123 ἐστί(ν) 3SG PA-10 εἰμί PA-10
010202 ἐστί(ν) 3SG PA-10 εἰμί PA-10
010206 εἶ 2SG PA-10 εἰμί PA-10
010213 μέλλει 3SG PA-1 μέλλω PA-1
010213 ζητεῖν INF PA-2 ζητέω PA-2
010218 εἰσί(ν) 3PL PA-10 εἰμί PA-10
010222 βασιλεύει 3SG PA-1 βασιλεύω PA-1
010303 ἐστί(ν) 3SG PA-10 εἰμί PA-10
010309 λέγειν INF PA-1 λέγω PA-1
010309 ἔχομεν 1PL PA-1/PA-8 ἔχω PA-1
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Where the columns are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the book/chapter/verse reference&lt;/li&gt;
&lt;li&gt;the normalized form&lt;/li&gt;
&lt;li&gt;the morphosyntactic properties&lt;/li&gt;
&lt;li&gt;the inflectional classes possible without disambiguation&lt;/li&gt;
&lt;li&gt;the lemma&lt;/li&gt;
&lt;li&gt;the disambiguated inflectional class&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Now it&#39;s time to do some counts.&lt;/p&gt;
&lt;p&gt;Let us first of all look at the number of distinct lemmas in each of our 13 classes.&lt;/p&gt;
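&lt;p&gt;A count like this needs only a few lines of Python. What follows is a minimal sketch rather than the script actually used: it assumes the six whitespace-separated columns described above, runs on a handful of the sample lines instead of the full file, and the function name is made up for illustration.&lt;/p&gt;

```python
from collections import defaultdict

# A few of the sample lines shown earlier, standing in for the full file.
SAMPLE = """\
010120 ἐστί(ν) 3SG PA-10 εἰμί PA-10
010206 εἶ 2SG PA-10 εἰμί PA-10
010213 μέλλει 3SG PA-1 μέλλω PA-1
010213 ζητεῖν INF PA-2 ζητέω PA-2
010309 λέγειν INF PA-1 λέγω PA-1
010309 ἔχομεν 1PL PA-1/PA-8 ἔχω PA-1
"""


def lemmas_by_class(lines):
    """Map each disambiguated class to the set of lemmas assigned to it."""
    classes = defaultdict(set)
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        ref, form, cell, candidates, lemma, inflectional_class = line.split()
        classes[inflectional_class].add(lemma)
    return classes


counts = {
    inflectional_class: len(lemmas)
    for inflectional_class, lemmas in lemmas_by_class(SAMPLE.splitlines()).items()
}
print(counts)
```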
&lt;p&gt;The numbers for classes &lt;strong&gt;PA-5&lt;/strong&gt; and above are low enough that we should look at them individually:&lt;/p&gt;
&lt;table&gt;
  &lt;tr&gt;
    &lt;th nowrap&gt;PA-1&lt;/th&gt;
    &lt;td&gt;barytone omega verbs&lt;/td&gt;
    &lt;td&gt;338&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;th nowrap&gt;PA-2&lt;/th&gt;
    &lt;td&gt;circumflex omega verbs with INF -εῖν / 3SG -εῖ&lt;/td&gt;
    &lt;td&gt;145&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;th nowrap&gt;PA-3&lt;/th&gt;
    &lt;td&gt;circumflex omega verbs with INF -οῦν / 3SG -οῖ&lt;/td&gt;
    &lt;td&gt;21&lt;/td&gt;
  &lt;/tr&gt;  
  &lt;tr&gt;
    &lt;th nowrap&gt;PA-4&lt;/th&gt;
    &lt;td&gt;circumflex omega verbs with INF -ᾶν / 3SG -ᾷ&lt;/td&gt;
    &lt;td&gt;31&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;th nowrap&gt;PA-5&lt;/th&gt;
    &lt;td&gt;ζάω + compound (συζάω)&lt;/td&gt;
    &lt;td&gt;2&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;th nowrap&gt;PA-6a&lt;/th&gt;
    &lt;td&gt;ὀμνύω; δείκνυμι + compound (ἀμφιέννυμι)&lt;/td&gt;
    &lt;td&gt;3&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;th nowrap&gt;PA-7&lt;/th&gt;
    &lt;td&gt;τίθημι + compounds (ἐπιτίθημι παρατίθημι περιτίθημι);&lt;br&gt;compounds of ἵημι (ἀφίημι συνίημι)&lt;/td&gt;
    &lt;td&gt;6&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;th nowrap&gt;PA-8&lt;/th&gt;
    &lt;td&gt;δίδωμι + compounds (διαδίδωμι ἀποδίδωμι μεταδίδωμι παραδίδωμι)&lt;/td&gt;
    &lt;td&gt;5&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;th nowrap&gt;PA-9&lt;/th&gt;
    &lt;td&gt;compounds of ἵστημι (καθίστημι μεθίστημι συνίστημι);&lt;br&gt;compound of φημί (σύμφημι);&lt;br&gt;that one weird case of συνίημι&lt;/td&gt;
    &lt;td&gt;5&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;th nowrap&gt;PA-9-ENC&lt;/th&gt;
    &lt;td&gt;φημί&lt;/td&gt;
    &lt;td&gt;1&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;th nowrap&gt;PA-10&lt;/th&gt;
    &lt;td&gt;εἰμί&lt;/td&gt;
    &lt;td&gt;1&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;th nowrap&gt;PA-10-COMP&lt;/th&gt;
    &lt;td&gt;compounds of εἰμί (ἄπειμι ἔξεστι(ν) πάρειμι)&lt;/td&gt;
    &lt;td&gt;3&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;th nowrap&gt;PA-11-COMP&lt;/th&gt;
    &lt;td&gt;compounds of εἶμι (ἔξειμι εἴσειμι)&lt;/td&gt;
    &lt;td&gt;2&lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;Notice that even the small counts are elevated due to compound verbs. Folding compounds of the same base verb, the classes from &lt;strong&gt;PA-5&lt;/strong&gt; on have only one or two members.&lt;/p&gt;
&lt;p&gt;This is just looking at the number of unique lemmas in each class but there are two other sets of numbers that are worth looking at: (1) the total number of tokens in the SBLGNT; (2) the distribution of classes amongst the hapax legomena.&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;th&gt;class&lt;/th&gt;             &lt;th&gt;lemmas&lt;/th&gt; &lt;th&gt;tokens&lt;/th&gt; &lt;th&gt;hapax&lt;/th&gt; &lt;th&gt;hapax details&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PA-1&lt;/th&gt;       &lt;td&gt;338&lt;/td&gt;    &lt;td&gt;2563&lt;/td&gt;   &lt;td&gt;151&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PA-2&lt;/th&gt;       &lt;td&gt;145&lt;/td&gt;    &lt;td&gt;856&lt;/td&gt;    &lt;td&gt;65&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PA-3&lt;/th&gt;       &lt;td&gt;21&lt;/td&gt;     &lt;td&gt;35&lt;/td&gt;     &lt;td&gt;15&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PA-4&lt;/th&gt;       &lt;td&gt;31&lt;/td&gt;     &lt;td&gt;117&lt;/td&gt;    &lt;td&gt;16&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PA-5&lt;/th&gt;       &lt;td&gt;2&lt;/td&gt;      &lt;td&gt;41&lt;/td&gt;     &lt;td&gt;1&lt;/td&gt;     &lt;td&gt;συζάω&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PA-6a&lt;/th&gt;      &lt;td&gt;3&lt;/td&gt;      &lt;td&gt;5&lt;/td&gt;      &lt;td&gt;2&lt;/td&gt;     &lt;td&gt;ὀμνύω ἀμφιέννυμι&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PA-7&lt;/th&gt;       &lt;td&gt;6&lt;/td&gt;      &lt;td&gt;37&lt;/td&gt;     &lt;td&gt;3&lt;/td&gt;     &lt;td&gt;εἴσειμι παρίστημι παρατίθημι&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PA-8&lt;/th&gt;       &lt;td&gt;5&lt;/td&gt;      &lt;td&gt;35&lt;/td&gt;     &lt;td&gt;2&lt;/td&gt;     &lt;td&gt;διαδίδωμι μεταδίδωμι&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PA-9&lt;/th&gt;       &lt;td&gt;5&lt;/td&gt;      &lt;td&gt;9&lt;/td&gt;      &lt;td&gt;3&lt;/td&gt;     &lt;td&gt;συνίημι σύμφημι μεθίστημι&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PA-9-ENC&lt;/th&gt;   &lt;td&gt;1&lt;/td&gt;      &lt;td&gt;22&lt;/td&gt;     &lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PA-10&lt;/th&gt;      &lt;td&gt;1&lt;/td&gt;      &lt;td&gt;1551&lt;/td&gt;   &lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PA-10-COMP&lt;/th&gt; &lt;td&gt;3&lt;/td&gt;      &lt;td&gt;39&lt;/td&gt;     &lt;td&gt;1&lt;/td&gt;     &lt;td&gt;ἄπειμι&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th nowrap&gt;PA-11-COMP&lt;/th&gt; &lt;td&gt;2&lt;/td&gt;      &lt;td&gt;4&lt;/td&gt;      &lt;td&gt;1&lt;/td&gt;     &lt;td&gt;εἴσειμι&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
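&lt;p&gt;The three numeric columns can be computed together in one pass. This is a minimal sketch under stated assumptions, not the script behind the table: it reads the same six-column format, runs on a few sample lines only, and treats a lemma as a hapax when it occurs exactly once in the input, which may not be how the hapax column above was counted.&lt;/p&gt;

```python
from collections import Counter, defaultdict

# A few sample lines in the six-column format, standing in for the full file.
SAMPLE = """\
010120 ἐστί(ν) 3SG PA-10 εἰμί PA-10
010123 ἐστί(ν) 3SG PA-10 εἰμί PA-10
010213 μέλλει 3SG PA-1 μέλλω PA-1
010213 ζητεῖν INF PA-2 ζητέω PA-2
010309 λέγειν INF PA-1 λέγω PA-1
"""


def class_stats(lines):
    """Return {class: (lemmas, tokens, hapax)} for six-column input lines."""
    lemma_counts = defaultdict(Counter)
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        lemma, inflectional_class = fields[4], fields[5]
        lemma_counts[inflectional_class][lemma] += 1
    stats = {}
    for inflectional_class, counter in lemma_counts.items():
        # Assumption: a hapax is a lemma seen exactly once in this input.
        hapax = sum(1 for n in counter.values() if n == 1)
        stats[inflectional_class] = (len(counter), sum(counter.values()), hapax)
    return stats


print(class_stats(SAMPLE.splitlines()))
```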

&lt;p&gt;Why do the hapax legomena matter? Well, they give an indication of which classes were still productive.&lt;/p&gt;
&lt;p&gt;Note, however, that the hapax in &lt;strong&gt;PA-5&lt;/strong&gt; and above are VERY low in number and, with the exception of ὀμνύω in &lt;strong&gt;PA-6a&lt;/strong&gt;, they are all compounds. This strongly suggests that only &lt;strong&gt;PA-1&lt;/strong&gt;, &lt;strong&gt;PA-2&lt;/strong&gt;, &lt;strong&gt;PA-3&lt;/strong&gt;, and &lt;strong&gt;PA-4&lt;/strong&gt; were productive.&lt;/p&gt;
&lt;p&gt;Notice that the token numbers for &lt;strong&gt;PA-6a&lt;/strong&gt;, &lt;strong&gt;PA-9&lt;/strong&gt; and &lt;strong&gt;PA-11-COMP&lt;/strong&gt; are particularly low too. Potentially relevant in the case of &lt;strong&gt;PA-6a&lt;/strong&gt; and &lt;strong&gt;PA-9&lt;/strong&gt; is that these are the classes most likely to have developed thematic alternatives. This might be worthy of a future post in this series!&lt;/p&gt;
&lt;p&gt;Let&#39;s now look at counts for each paradigm cell for each class:&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;th&gt;&amp;nbsp;&lt;/th&gt;  &lt;th nowrap&gt;PA-1&lt;/th&gt;    &lt;th nowrap&gt;PA-2&lt;/th&gt;    &lt;th nowrap&gt;PA-3&lt;/th&gt;    &lt;th nowrap&gt;PA-4&lt;/th&gt;    &lt;th nowrap&gt;PA-5&lt;/th&gt;    &lt;th nowrap&gt;PA-6a&lt;/th&gt;   &lt;th nowrap&gt;PA-7&lt;/th&gt;    &lt;th nowrap&gt;PA-8&lt;/th&gt;    &lt;th nowrap&gt;PA-9&lt;/th&gt;    &lt;th nowrap&gt;PA-9-ENC&lt;/th&gt;   &lt;th nowrap&gt;PA-10&lt;/th&gt;   &lt;th nowrap&gt;PA-10-COMP&lt;/th&gt; &lt;th nowrap&gt;PA-11-COMP&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;INF&lt;/th&gt;     &lt;td&gt;394&lt;/td&gt;     &lt;td&gt;171&lt;/td&gt;     &lt;td&gt;5&lt;/td&gt;       &lt;td&gt;21&lt;/td&gt;      &lt;td&gt;13&lt;/td&gt;      &lt;td&gt;1&lt;/td&gt;       &lt;td&gt;11&lt;/td&gt;      &lt;td&gt;10&lt;/td&gt;      &lt;td&gt;1&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;       &lt;td&gt;124&lt;/td&gt;     &lt;td&gt;3&lt;/td&gt;       &lt;td&gt;3&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;1SG&lt;/th&gt;     &lt;td&gt;460&lt;/td&gt;     &lt;td&gt;116&lt;/td&gt;     &lt;td&gt;3&lt;/td&gt;       &lt;td&gt;21&lt;/td&gt;      &lt;td&gt;6&lt;/td&gt;       &lt;td&gt;1&lt;/td&gt;       &lt;td&gt;7&lt;/td&gt;       &lt;td&gt;10&lt;/td&gt;      &lt;td&gt;2&lt;/td&gt;       &lt;td&gt;4&lt;/td&gt;       &lt;td&gt;138&lt;/td&gt;     &lt;td&gt;1&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;2SG&lt;/th&gt;     &lt;td&gt;164&lt;/td&gt;     &lt;td&gt;46&lt;/td&gt;      &lt;td&gt;-&lt;/td&gt;       &lt;td&gt;5&lt;/td&gt;       &lt;td&gt;2&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;       &lt;td&gt;1&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;       &lt;td&gt;92&lt;/td&gt;      &lt;td&gt;1&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;3SG&lt;/th&gt;     &lt;td&gt;923&lt;/td&gt;     &lt;td&gt;295&lt;/td&gt;     &lt;td&gt;16&lt;/td&gt;      &lt;td&gt;35&lt;/td&gt;      &lt;td&gt;13&lt;/td&gt;      &lt;td&gt;3&lt;/td&gt;       &lt;td&gt;11&lt;/td&gt;      &lt;td&gt;13&lt;/td&gt;      &lt;td&gt;5&lt;/td&gt;       &lt;td&gt;17&lt;/td&gt;      &lt;td&gt;896&lt;/td&gt;     &lt;td&gt;31&lt;/td&gt;      &lt;td&gt;-&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;1PL&lt;/th&gt;     &lt;td&gt;141&lt;/td&gt;     &lt;td&gt;52&lt;/td&gt;      &lt;td&gt;2&lt;/td&gt;       &lt;td&gt;19&lt;/td&gt;      &lt;td&gt;5&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;       &lt;td&gt;1&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;       &lt;td&gt;52&lt;/td&gt;      &lt;td&gt;1&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;2PL&lt;/th&gt;     &lt;td&gt;218&lt;/td&gt;     &lt;td&gt;99&lt;/td&gt;      &lt;td&gt;4&lt;/td&gt;       &lt;td&gt;8&lt;/td&gt;       &lt;td&gt;1&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;       &lt;td&gt;4&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;       &lt;td&gt;93&lt;/td&gt;      &lt;td&gt;1&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;3PL&lt;/th&gt;     &lt;td&gt;263&lt;/td&gt;     &lt;td&gt;77&lt;/td&gt;      &lt;td&gt;5&lt;/td&gt;       &lt;td&gt;8&lt;/td&gt;       &lt;td&gt;1&lt;/td&gt;       &lt;td&gt;-&lt;/td&gt;       &lt;td&gt;3&lt;/td&gt;       &lt;td&gt;1&lt;/td&gt;       &lt;td&gt;1&lt;/td&gt;       &lt;td&gt;1&lt;/td&gt;       &lt;td&gt;156&lt;/td&gt;     &lt;td&gt;1&lt;/td&gt;       &lt;td&gt;1&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;&amp;nbsp;&lt;/th&gt;  &lt;th&gt;2563&lt;/th&gt;    &lt;th&gt;856&lt;/th&gt;     &lt;th&gt;35&lt;/th&gt;      &lt;th&gt;117&lt;/th&gt;     &lt;th&gt;41&lt;/th&gt;      &lt;th&gt;5&lt;/th&gt;       &lt;th&gt;37&lt;/th&gt;      &lt;th&gt;35&lt;/th&gt;      &lt;th&gt;9&lt;/th&gt;       &lt;th&gt;22&lt;/th&gt;      &lt;th&gt;1551&lt;/th&gt;    &lt;th&gt;39&lt;/th&gt;     &lt;th&gt;4&lt;/th&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;What is obvious from this is just how important, regardless of inflectional class, the &lt;strong&gt;3SG&lt;/strong&gt; form is. The &lt;strong&gt;INF&lt;/strong&gt; is also very important. We&#39;ve seen in a previous post that both cells are very good predictors of inflectional class (much better than &lt;strong&gt;1SG&lt;/strong&gt;) but they are also just both very common. The &lt;strong&gt;1SG&lt;/strong&gt;, despite being a bad predictor, is still important in terms of frequency.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;3PL&lt;/strong&gt; is a distant fourth with one apparent deviation: it is very common in &lt;strong&gt;PA-10&lt;/strong&gt; (i.e. the copula), more so than the &lt;strong&gt;INF&lt;/strong&gt; or &lt;strong&gt;1SG&lt;/strong&gt;. In fact, the proportion of &lt;strong&gt;3PL&lt;/strong&gt; in this class is actually average; it&#39;s the &lt;strong&gt;INF&lt;/strong&gt; and &lt;strong&gt;1SG&lt;/strong&gt; that are unusually low (with much of the frequency drop taken up by the &lt;strong&gt;3SG&lt;/strong&gt;).&lt;/p&gt;
&lt;p&gt;As well as εἰμί, φημί (&lt;strong&gt;PA-9-ENC&lt;/strong&gt;) is also disproportionately &lt;strong&gt;3SG&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Of course, given how common &lt;strong&gt;PA-1&lt;/strong&gt; is, even the plurals there outnumber the most common cells in the other classes.&lt;/p&gt;
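&lt;p&gt;The proportions mentioned above for &lt;strong&gt;PA-10&lt;/strong&gt; can be checked directly against the table. The following is only a sketch, not the script used for this series: the names &lt;code&gt;COUNTS&lt;/code&gt; and &lt;code&gt;share&lt;/code&gt; are made up for illustration, and the numbers are simply copied from the table.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;CELLS = ["INF", "1SG", "2SG", "3SG", "1PL", "2PL", "3PL"]

# Token counts per class, in CELLS order, copied from the table above
# (a "-" in the table is entered as 0).
COUNTS = {
    "PA-1": [394, 460, 164, 923, 141, 218, 263],
    "PA-2": [171, 116, 46, 295, 52, 99, 77],
    "PA-3": [5, 3, 0, 16, 2, 4, 5],
    "PA-4": [21, 21, 5, 35, 19, 8, 8],
    "PA-5": [13, 6, 2, 13, 5, 1, 1],
    "PA-6a": [1, 1, 0, 3, 0, 0, 0],
    "PA-7": [11, 7, 0, 11, 1, 4, 3],
    "PA-8": [10, 10, 1, 13, 0, 0, 1],
    "PA-9": [1, 2, 0, 5, 0, 0, 1],
    "PA-9-ENC": [0, 4, 0, 17, 0, 0, 1],
    "PA-10": [124, 138, 92, 896, 52, 93, 156],
    "PA-10-COMP": [3, 1, 1, 31, 1, 1, 1],
    "PA-11-COMP": [3, 0, 0, 0, 0, 0, 1],
}


def share(counts, cell):
    """Proportion of the tokens in counts that fall in the given cell."""
    return counts[CELLS.index(cell)] / sum(counts)


# Sum each cell across all classes to get the overall distribution.
overall = [sum(row[i] for row in COUNTS.values()) for i in range(len(CELLS))]

# Print the PA-10 share next to the overall share, as percentages.
for cell in ["INF", "1SG", "3SG", "3PL"]:
    print(cell,
          round(100 * share(COUNTS["PA-10"], cell), 1),
          round(100 * share(overall, cell), 1))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;On these figures the &lt;strong&gt;3PL&lt;/strong&gt; share is about 10% both for &lt;strong&gt;PA-10&lt;/strong&gt; and overall, while the &lt;strong&gt;INF&lt;/strong&gt; and &lt;strong&gt;1SG&lt;/strong&gt; shares for &lt;strong&gt;PA-10&lt;/strong&gt; fall well below their overall values.&lt;/p&gt;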
&lt;p&gt;If the goal is just to identify the person/number, not the class (which is true in reception but not in learning), then a lot of those numbers collapse because of shared endings. Here are the counts for just the common endings (without accents):&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;td rowspan=&#34;2&#34;&gt;INF&lt;/td&gt;   &lt;td&gt;-ν&lt;/td&gt;          &lt;td&gt;604&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;                    &lt;td&gt;-ναι&lt;/td&gt;        &lt;td&gt;153&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td rowspan=&#34;2&#34;&gt;1SG&lt;/td&gt;   &lt;td&gt;-ω&lt;/td&gt;          &lt;td&gt;606&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;                    &lt;td&gt;-μι&lt;/td&gt;         &lt;td&gt;163&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td rowspan=&#34;3&#34;&gt;2SG&lt;/td&gt;   &lt;td&gt;-{ι}ς&lt;/td&gt;       &lt;td&gt;217&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;                    &lt;td&gt;-ς&lt;/td&gt;          &lt;td&gt;1&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;                    &lt;td&gt;(-)ει&lt;/td&gt;       &lt;td&gt;93&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td rowspan=&#34;3&#34;&gt;3SG&lt;/td&gt;   &lt;td&gt;-{ι}&lt;/td&gt;        &lt;td&gt;1282&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;                    &lt;td&gt;-σι(ν)&lt;/td&gt;      &lt;td&gt;49&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;                    &lt;td&gt;(-)εστι(ν)&lt;/td&gt;  &lt;td&gt;927&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;1PL&lt;/td&gt;             &lt;td&gt;-μεν&lt;/td&gt;        &lt;td&gt;273&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;2PL&lt;/td&gt;             &lt;td&gt;-τε&lt;/td&gt;         &lt;td&gt;428&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td rowspan=&#34;2&#34;&gt;3PL&lt;/td&gt;   &lt;td&gt;-σι(ν)&lt;/td&gt;      &lt;td&gt;511&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;                    &lt;td&gt;-ασι(ν)&lt;/td&gt;     &lt;td&gt;7&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;This just emphasises even more (even though it was in the previous table) that there is only 1 &lt;strong&gt;2SG&lt;/strong&gt; in -ς (without an iota, subscripted or otherwise): the παραδίδως in Luke 22.48.&lt;/p&gt;
&lt;p&gt;The 7 &lt;strong&gt;3PL&lt;/strong&gt;s in -ασι(ν) are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;τιθέασι(ν) in Matt 5.15&lt;/li&gt;
&lt;li&gt;ἐπιτιθέασι(ν) in Matt 23.4&lt;/li&gt;
&lt;li&gt;περιτιθέασι(ν) in Mark 15.17&lt;/li&gt;
&lt;li&gt;φασί(ν) in Rom 3.8&lt;/li&gt;
&lt;li&gt;συνιᾶσι(ν) in 2Co 10.12&lt;/li&gt;
&lt;li&gt;εἰσίασι(ν) in Heb 9.6&lt;/li&gt;
&lt;li&gt;διδόασι(ν) in Rev 17.13&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;One &lt;em&gt;could&lt;/em&gt; argue that these are subsumed by saying &lt;strong&gt;3PL&lt;/strong&gt; ends in -σι(ν) but given that, in the very same lexemes, -σι(ν) can also indicate &lt;strong&gt;3SG&lt;/strong&gt;, it is useful to call out the α, even though the root vowel alternation is enough to distinguish singular and plural.&lt;/p&gt;
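&lt;p&gt;The collapsed counts in the table of common endings can be reproduced from the per-class counts. Here is a minimal sketch for the &lt;strong&gt;INF&lt;/strong&gt; and &lt;strong&gt;1SG&lt;/strong&gt; only, assuming that &lt;strong&gt;PA-1&lt;/strong&gt; to &lt;strong&gt;PA-5&lt;/strong&gt; share -ν / -ω and that every other class shares -ναι / -μι (the variable names are hypothetical):&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;from collections import Counter

# (INF, 1SG) token counts per class, from the per-cell table above.
COUNTS = {
    "PA-1": (394, 460), "PA-2": (171, 116), "PA-3": (5, 3),
    "PA-4": (21, 21), "PA-5": (13, 6), "PA-6a": (1, 1),
    "PA-7": (11, 7), "PA-8": (10, 10), "PA-9": (1, 2),
    "PA-9-ENC": (0, 4), "PA-10": (124, 138),
    "PA-10-COMP": (3, 1), "PA-11-COMP": (3, 0),
}

# Assumption: PA-1 to PA-5 share the omega-type endings,
# every other class shares the mi-type endings.
OMEGA_CLASSES = {"PA-1", "PA-2", "PA-3", "PA-4", "PA-5"}

totals = Counter()
for cls, (inf, sg1) in COUNTS.items():
    if cls in OMEGA_CLASSES:
        totals["INF -ν"] += inf
        totals["1SG -ω"] += sg1
    else:
        totals["INF -ναι"] += inf
        totals["1SG -μι"] += sg1

for ending, n in sorted(totals.items()):
    print(ending, n)
&lt;/code&gt;&lt;/pre&gt;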
&lt;p&gt;That&#39;s it (for now) for counts of the present actives. In the next couple of posts, we&#39;ll turn to the middle forms.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part seventeen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">pyuca 1.2 Released with Support for New Versions of Unicode</title>
    <link href="https://jktauber.com/2017/09/25/pyuca-12-released-support-new-versions-unicode/" rel="alternate" type="text/html" title="pyuca 1.2 Released with Support for New Versions of Unicode"/>
    <published>2017-09-25</published>
    <updated>2017-09-25</updated>
    <id>https://jktauber.com/2017/09/25/pyuca-12-released-support-new-versions-unicode</id>
    <content type="html" xml:base="https://jktauber.com/2017/09/25/pyuca-12-released-support-new-versions-unicode/">&lt;p&gt;pyuca is my pure-Python implementation of the Unicode Collation Algorithm—a library I use almost every day to properly sort Greek (although the library is not Greek-specific). I was recently asked how to use pyuca with a more recent DUCET than 6.3.0. That led to me needing to make a number of changes to the core code so it now supports 8.0.0, 9.0.0 and 10.0.0 as long as you have the right Python version.&lt;/p&gt;
&lt;p&gt;pyuca has always supported custom collation element tables, but when someone tried the DUCET from Unicode 8.0.0, the test suite failed.&lt;/p&gt;
&lt;p&gt;At first I thought perhaps that was because the test suite is from 6.3.0 (or 5.2.0 if running Python 2.7) but when I got around to trying the 8.0.0 test suite on the 8.0.0 DUCET it too failed.&lt;/p&gt;
&lt;p&gt;It turned out that the Unicode Consortium had made a few changes to which code points are considered CJK Unified Ideographs. This is hard-coded in pyuca because it&#39;s required for implementing the implicit weight calculations (weights for certain CJK ideographs are calculated programmatically rather than explicitly listed in the DUCET).&lt;/p&gt;
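&lt;p&gt;For reference, the implicit weight calculation itself is small. The sketch below follows the formula in UTS #10 as I understand it and is not pyuca&#39;s actual code: the function name is hypothetical, and the caller has to supply the base, because deciding which base applies to which code point is exactly the part that changes between Unicode versions.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;def implicit_weights(cp, base):
    """Return the two primary weights derived from code point cp.

    Assumed bases: 0xFB40 for the core CJK Unified Ideographs, 0xFB80
    for unified ideographs in the extension blocks, 0xFBC0 otherwise.
    """
    aaaa = base + cp // 0x8000      # the bits above the low 15
    bbbb = cp % 0x8000 + 0x8000     # the low 15 bits, with the top bit set
    return aaaa, bbbb


print([hex(w) for w in implicit_weights(0x4E00, 0xFB40)])
print([hex(w) for w in implicit_weights(0x2A700, 0xFB80)])
&lt;/code&gt;&lt;/pre&gt;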
&lt;p&gt;In 9.0.0 the collation element table format was changed slightly to add a new @implicitweights directive, so for things to work with 9.0.0 I had to implement that. Then in 10.0.0, more changes were made to which code points are considered CJK Unified Ideographs.&lt;/p&gt;
&lt;p&gt;It didn&#39;t stop there, though. Because pyuca relies on Python&#39;s &lt;code&gt;unicodedata&lt;/code&gt; library for getting information on character categories, certain versions of Python won&#39;t work with certain versions of Unicode.&lt;/p&gt;
&lt;p&gt;So I added some logic (both to pyuca itself, and to the test suite) to use the appropriate collation code (with the right implicit weight calculations) and appropriate DUCET depending on what version of Python you are running.&lt;/p&gt;
&lt;p&gt;Some of this dispatching-based-on-Python-version had already been written by Chris Beaven, Paul McLanahan, and Michal Čihař as part of their backporting of pyuca to 2.7 (after I&#39;d declared I&#39;d only support 3). So I just extended this with the following results:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Python 2.7: test and use 5.2.0&lt;/li&gt;
&lt;li&gt;Python 3.3: test 5.2.0, 6.3.0 and use 6.3.0 by default&lt;/li&gt;
&lt;li&gt;Python 3.4: test 5.2.0, 6.3.0 and use 6.3.0 by default&lt;/li&gt;
&lt;li&gt;Python 3.5: test 5.2.0, 6.3.0, 8.0.0 and use 8.0.0 by default&lt;/li&gt;
&lt;li&gt;Python 3.6: test 5.2.0, 6.3.0, 8.0.0, 9.0.0 and use 9.0.0 by default&lt;/li&gt;
&lt;li&gt;Python 3.7-dev: test 5.2.0, 6.3.0, 8.0.0, 9.0.0, 10.0.0 (so we&#39;re ready)&lt;/li&gt;
&lt;/ul&gt;
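&lt;p&gt;The defaults in the list above amount to a lookup keyed on the running Python version. A minimal sketch of that idea follows. The names are hypothetical, this is not how pyuca itself is organised, and the fallback for unlisted versions is an assumption:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;import sys

# Default DUCET version per (major, minor) Python version, from the list above.
DEFAULT_DUCET = {
    (2, 7): "5.2.0",
    (3, 3): "6.3.0",
    (3, 4): "6.3.0",
    (3, 5): "8.0.0",
    (3, 6): "9.0.0",
}


def default_ducet(version_info=sys.version_info):
    """Pick a default DUCET version for the given Python version."""
    key = (version_info[0], version_info[1])
    # Assumption: fall back to the newest listed default for unknown versions.
    return DEFAULT_DUCET.get(key, "9.0.0")


print(default_ducet((3, 5)))
print(default_ducet())
&lt;/code&gt;&lt;/pre&gt;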
&lt;p&gt;pyuca 1.2 has now been released and is available on PyPI. The repository is at &lt;a href=&#34;https://github.com/jtauber/pyuca&#34;&gt;https://github.com/jtauber/pyuca&lt;/a&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">pyuca is my pure-Python implementation of the Unicode Collation Algorithm—a library I use almost every day to properly sort Greek (although the library is not Greek-specific). I was recently asked how to use pyuca with a more recent DUCET than 6.3.0. That led to me needing to make a number of changes to the core code so it now supports 8.0.0, 9.0.0 and 10.0.0 as long as you have the right Python version.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 16</title>
    <link href="https://jktauber.com/2017/09/07/tour-greek-morphology-part-16/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 16"/>
    <published>2017-09-07</published>
    <updated>2017-09-07</updated>
    <id>https://jktauber.com/2017/09/07/tour-greek-morphology-part-16</id>
    <content type="html" xml:base="https://jktauber.com/2017/09/07/tour-greek-morphology-part-16/">&lt;p&gt;Part sixteen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;In the &lt;a href=&#34;/2017/09/05/tour-greek-morphology-part-15/&#34;&gt;previous post&lt;/a&gt; we went through and made sure we had all our active endings covered ready for counting. As pointed out (and in detail in &lt;a href=&#34;/2017/08/26/tour-greek-morphology-part-13/&#34;&gt;Part 13&lt;/a&gt;), though, we still had some ambiguities. If we want to assign just a single inflectional class to each form in the SBLGNT, we need some way of disambiguating. Fortunately, the lemma does this (even if it resorts to using fake forms like the uncontracted circumflex &lt;strong&gt;1SG&lt;/strong&gt;).&lt;/p&gt;
&lt;p&gt;This allows us to write code that basically follows these rules:&lt;/p&gt;
&lt;table&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;1SG&lt;/b&gt;:Xημι or
      &lt;b&gt;3SG&lt;/b&gt;:Xησι(ν)
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;PA-7&lt;/b&gt; if lemma ends in τίθημι or ίημι&lt;br&gt;
      &lt;b&gt;PA-9&lt;/b&gt; if lemma ends in ίστημι or φημι
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;1PL&lt;/b&gt;:Xῶμεν or
      &lt;b&gt;3PL&lt;/b&gt;:Xῶσι(ν)
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;PA-5&lt;/b&gt; if lemma is ζάω&lt;br&gt;
      &lt;b&gt;PA-4&lt;/b&gt; otherwise
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;1PL&lt;/b&gt;:Xοῦμεν or
      &lt;b&gt;3PL&lt;/b&gt;:Xοῦσι(ν)
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;PA-2&lt;/b&gt; if lemma ends in έω&lt;br&gt;
      &lt;b&gt;PA-3&lt;/b&gt; if lemma ends in όω
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;2PL&lt;/b&gt;:Xετε
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;PA-1&lt;/b&gt; if lemma ends in ω&lt;br&gt;
      &lt;b&gt;PA-7&lt;/b&gt; if lemma ends in ημι
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;1PL&lt;/b&gt;:Xομεν
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;PA-1&lt;/b&gt; if lemma ends in ω&lt;br&gt;
      &lt;b&gt;PA-8&lt;/b&gt; if lemma ends in ωμι
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;1SG&lt;/b&gt;:Xῶ
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;PA-2&lt;/b&gt; if lemma ends in έω&lt;br&gt;
      &lt;b&gt;PA-3&lt;/b&gt; if lemma ends in όω&lt;br&gt;
      &lt;b&gt;PA-5&lt;/b&gt; if lemma is ζάω&lt;br&gt;
      &lt;b&gt;PA-4&lt;/b&gt; if lemma otherwise ends in άω
    &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;
      &lt;b&gt;INF&lt;/b&gt;:Xέναι
    &lt;/td&gt;
    &lt;td&gt;&lt;i&gt;is&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;
      &lt;b&gt;PA-7&lt;/b&gt; if lemma ends with ίημι&lt;br&gt;
      &lt;b&gt;PA-11-COMPOUND&lt;/b&gt; if lemma ends with ειμι
    &lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;
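&lt;p&gt;As a rough illustration, the rules in the table can be expressed as data plus a small lookup function. This is a sketch rather than the actual script: &lt;code&gt;RULES&lt;/code&gt;, &lt;code&gt;SPECIAL_PA1&lt;/code&gt; and &lt;code&gt;disambiguate&lt;/code&gt; are hypothetical names, and it assumes accents and breathings are stripped from the lemma before its ending is compared. The special-case set anticipates the three &lt;strong&gt;1PL&lt;/strong&gt; forms discussed next.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;import unicodedata


def nfd(text):
    """Decompose so that accents and breathings become separate marks."""
    return unicodedata.normalize("NFD", text)


def bare(text):
    """Strip accents and breathings, for comparing lemma endings."""
    return "".join(c for c in nfd(text) if not unicodedata.combining(c))


# (cell, form ending) pairs mapped to ordered (lemma test, class) choices.
# A lemma test starting with "=" must equal the lemma; otherwise it is an ending.
RULES = [
    ([("1SG", "ημι"), ("3SG", "ησι(ν)")],
     [("τιθημι", "PA-7"), ("ιημι", "PA-7"),
      ("ιστημι", "PA-9"), ("φημι", "PA-9")]),
    ([("1PL", "ῶμεν"), ("3PL", "ῶσι(ν)")],
     [("=ζαω", "PA-5"), ("", "PA-4")]),
    ([("1PL", "οῦμεν"), ("3PL", "οῦσι(ν)")],
     [("εω", "PA-2"), ("οω", "PA-3")]),
    ([("2PL", "ετε")], [("ημι", "PA-7"), ("ω", "PA-1")]),
    ([("1PL", "ομεν")], [("ωμι", "PA-8"), ("ω", "PA-1")]),
    ([("1SG", "ῶ")],
     [("εω", "PA-2"), ("οω", "PA-3"), ("=ζαω", "PA-5"), ("αω", "PA-4")]),
    ([("INF", "έναι")], [("ιημι", "PA-7"), ("ειμι", "PA-11-COMPOUND")]),
]

# Forms with an athematic lemma that nevertheless inflect like PA-1.
SPECIAL_PA1 = {nfd(f) for f in ["ἀφίομεν", "ἱστάνομεν", "συνιστάνομεν"]}


def disambiguate(cell, form, lemma):
    """Return the class chosen by the rules, or None if no rule applies."""
    if nfd(form) in SPECIAL_PA1:
        return "PA-1"
    for patterns, choices in RULES:
        if not any(cell == c and nfd(form).endswith(nfd(e))
                   for c, e in patterns):
            continue
        for test, cls in choices:
            if test.startswith("="):
                if bare(lemma) == test[1:]:
                    return cls
            elif bare(lemma).endswith(test):
                return cls
    return None


print(disambiguate("1PL", "ἔχομεν", "ἔχω"))
print(disambiguate("INF", "συνιέναι", "συνίημι"))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Keeping the rules as data makes it easy to compare them against the table row by row. Returning &lt;code&gt;None&lt;/code&gt; when nothing matches leaves unambiguous forms to whatever pattern matching has already assigned them a single class.&lt;/p&gt;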

&lt;p&gt;Part 13 also mentioned the &lt;strong&gt;2SG&lt;/strong&gt;:Xης ambiguity between &lt;strong&gt;PA-7&lt;/strong&gt; and &lt;strong&gt;PA-9&lt;/strong&gt; but that doesn&#39;t crop up in the SBLGNT: there are in fact no &lt;strong&gt;PA-7&lt;/strong&gt; OR &lt;strong&gt;PA-9&lt;/strong&gt; &lt;strong&gt;2SG&lt;/strong&gt;s in the SBLGNT.&lt;/p&gt;
&lt;p&gt;There ARE however three &lt;strong&gt;1PL&lt;/strong&gt; forms which do still cause a problem with the rules above:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ἀφίομεν&lt;/li&gt;
&lt;li&gt;ἱστάνομεν&lt;/li&gt;
&lt;li&gt;συνιστάνομεν&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each of these matches &lt;strong&gt;1PL&lt;/strong&gt;:Xομεν BUT the MorphGNT lemmas are ἀφίημι, ἵστημι, and συνίστημι respectively.&lt;/p&gt;
&lt;p&gt;What is happening here is that new forms have developed belonging to a different inflectional class than the particular form chosen for the lemma. For example ἱστάνομεν is an ω verb but it&#39;s otherwise the same as the athematic ἵστημι. Arguably the MorphGNT lemmatization could be changed to ἱστάνω if you consider a difference in inflectional class to be a new lexeme. This is a topic I&#39;ll be covering in my talk at SBL 2017 in Boston in November. For now, in our Python code, we&#39;ll just special-case these as &lt;strong&gt;PA-1&lt;/strong&gt; but we will come back to discussing this more. Note that we only caught this here because it was an ambiguous form so we were checking for particular lemma patterns.&lt;/p&gt;
&lt;p&gt;We now have an inflectional class for all 5,314 present active infinitive or indicative forms in the MorphGNT SBLGNT.&lt;/p&gt;
&lt;p&gt;The output of my Python script begins:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;010120 ἐστί(ν) 3SG PA-10 εἰμί PA-10
010123 ἐστί(ν) 3SG PA-10 εἰμί PA-10
010202 ἐστί(ν) 3SG PA-10 εἰμί PA-10
010206 εἶ 2SG PA-10 εἰμί PA-10
010213 μέλλει 3SG PA-1 μέλλω PA-1
010213 ζητεῖν INF PA-2 ζητέω PA-2
010218 εἰσί(ν) 3PL PA-10 εἰμί PA-10
010222 βασιλεύει 3SG PA-1 βασιλεύω PA-1
010303 ἐστί(ν) 3SG PA-10 εἰμί PA-10
010309 λέγειν INF PA-1 λέγω PA-1
010309 ἔχομεν 1PL PA-1/PA-8 ἔχω PA-1
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The columns are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the book/chapter/verse reference&lt;/li&gt;
&lt;li&gt;the normalized form&lt;/li&gt;
&lt;li&gt;the morphosyntactic properties&lt;/li&gt;
&lt;li&gt;the inflectional classes possible without disambiguation&lt;/li&gt;
&lt;li&gt;the lemma&lt;/li&gt;
&lt;li&gt;the disambiguated inflectional class&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can download the entire thing &lt;a href=&#34;https://gist.github.com/jtauber/510a1aa27e2d7e2ccb979fd152ee9e8a/f950582b7f03fec5bf09d155ead2b98734ab636e&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We&#39;ll use this to do our counts in the next post.&lt;/p&gt;
&lt;p&gt;One question comes to mind: are the disambiguated inflectional classes consistent for all the forms of a lexeme (beyond the three exceptions we already saw above)?&lt;/p&gt;
&lt;p&gt;Well, looking at the full output of the script, we find there are a few more in the SBLGNT:&lt;/p&gt;
&lt;table&gt;
  &lt;tr&gt;
    &lt;th rowspan=&#34;3&#34;&gt;ὀμνύω&lt;/th&gt;
    &lt;td rowspan=&#34;2&#34;&gt;&lt;b&gt;INF&lt;/b&gt;&lt;/td&gt;
    &lt;td&gt;ὀμνύναι&lt;/td&gt;
    &lt;td&gt;&lt;b&gt;PA-6a&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;ὀμνύειν&lt;/td&gt;
    &lt;td rowspan=&#34;2&#34;&gt;&lt;b&gt;PA-1&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td colspan=&#34;2&#34;&gt;&lt;i&gt;all other forms&lt;/i&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;th rowspan=&#34;4&#34;&gt;δείκνυμι&lt;/th&gt;
    &lt;td&gt;&lt;b&gt;INF&lt;/b&gt;&lt;/td&gt;
    &lt;td&gt;δεικνύειν&lt;/td&gt;
    &lt;td rowspan=&#34;2&#34;&gt;&lt;b&gt;PA-1&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&lt;b&gt;2SG&lt;/b&gt;&lt;/td&gt;
    &lt;td&gt;δεικνύεις&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&lt;b&gt;1SG&lt;/b&gt;&lt;/td&gt;
    &lt;td&gt;δείκνυμι&lt;/td&gt;
    &lt;td rowspan=&#34;2&#34;&gt;&lt;b&gt;PA-6a&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&lt;b&gt;3SG&lt;/b&gt;&lt;/td&gt;
    &lt;td&gt;δείκνυσι(ν)&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;th rowspan=&#34;4&#34;&gt;συνίστημι&lt;/th&gt;
    &lt;td&gt;&lt;b&gt;1PL&lt;/b&gt;&lt;/td&gt;
    &lt;td&gt;συνιστάνομεν&lt;/td&gt;
    &lt;td rowspan=&#34;2&#34;&gt;&lt;b&gt;PA-1&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&lt;b&gt;INF&lt;/b&gt;&lt;/td&gt;
    &lt;td&gt;συνιστάνειν&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&lt;b&gt;1SG&lt;/b&gt;&lt;/td&gt;
    &lt;td&gt;συνίστημι&lt;/td&gt;
    &lt;td rowspan=&#34;2&#34;&gt;&lt;b&gt;PA-9&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&lt;b&gt;3SG&lt;/b&gt;&lt;/td&gt;
    &lt;td&gt;συνίστησι(ν)&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;th rowspan=&#34;4&#34;&gt;ἀφίημι&lt;/th&gt;
    &lt;td&gt;&lt;b&gt;1PL&lt;/b&gt;&lt;/td&gt;
    &lt;td&gt;ἀφίομεν&lt;/td&gt;
    &lt;td rowspan=&#34;2&#34;&gt;&lt;b&gt;PA-1&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&lt;b&gt;3PL&lt;/b&gt;&lt;/td&gt;
    &lt;td&gt;ἀφίουσι(ν)&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&lt;b&gt;2SG&lt;/b&gt;&lt;/td&gt;
    &lt;td&gt;ἀφεῖς&lt;/td&gt;
    &lt;td&gt;&lt;b&gt;PA-2&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td colspan=&#34;2&#34;&gt;&lt;i&gt;all other forms&lt;/i&gt;&lt;/td&gt;
    &lt;td&gt;&lt;b&gt;PA-7&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;th rowspan=&#34;4&#34;&gt;συνίημι&lt;/th&gt;
    &lt;td&gt;&lt;b&gt;INF&lt;/b&gt;&lt;/td&gt;
    &lt;td&gt;συνιέναι&lt;/td&gt;
    &lt;td rowspan=&#34;2&#34;&gt;&lt;b&gt;PA-7&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&lt;b&gt;2PL&lt;/b&gt;&lt;/td&gt;
    &lt;td&gt;συνίετε&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td rowspan=&#34;2&#34;&gt;&lt;b&gt;3PL&lt;/b&gt;&lt;/td&gt;
    &lt;td&gt;συνίουσι(ν)&lt;/td&gt;
    &lt;td&gt;&lt;b&gt;PA-1&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;συνιᾶσι(ν)&lt;/td&gt;
    &lt;td&gt;&lt;b&gt;PA-9&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;In each case we have an originally athematic verb occasionally acting like it&#39;s thematic (and, in the case of ὀμνύω even the lemma is written as if it was thematic). We WILL have more to say about this in a few posts but we&#39;ve now done enough that we can count how many times each inflectional class appears in the SBLGNT and how many different lexemes follow each inflectional class. We&#39;ll do that in the very next post.&lt;/p&gt;
&lt;p&gt;There is still another thing worth checking: is the value of X in our paradigm patterns consistent across a lexeme too? Yes it is, accent aside, if you only compare within the same inflectional class. The X for the δείκνυμι cells in &lt;strong&gt;PA-6a&lt;/strong&gt; is always δείκν, for example, but the &lt;strong&gt;PA-1&lt;/strong&gt; cases have X = δεικνύ.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE&lt;/strong&gt;: I just discovered a mis-disambiguated παριστάνετε that needs to be special-cased as a &lt;strong&gt;PA-1&lt;/strong&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part sixteen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 15</title>
    <link href="https://jktauber.com/2017/09/05/tour-greek-morphology-part-15/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 15"/>
    <published>2017-09-05</published>
    <updated>2017-09-05</updated>
    <id>https://jktauber.com/2017/09/05/tour-greek-morphology-part-15</id>
    <content type="html" xml:base="https://jktauber.com/2017/09/05/tour-greek-morphology-part-15/">&lt;p&gt;Part fifteen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;In the previous two posts in this series (&lt;a href=&#34;/2017/08/26/tour-greek-morphology-part-13/&#34;&gt;part 13&lt;/a&gt; and &lt;a href=&#34;/2017/08/29/tour-greek-morphology-part-14/&#34;&gt;part 14&lt;/a&gt;) we summarized the paradigms we&#39;ve seen so far for the present infinitive and indicative both in the active and middle.&lt;/p&gt;
&lt;p&gt;Do these paradigms cover all the forms in the Greek New Testament? Which paradigms are more common? Which are productive? We&#39;ll explore these questions in the next few posts.&lt;/p&gt;
&lt;p&gt;Let&#39;s start with the active forms.&lt;/p&gt;
&lt;p&gt;The first test is whether every present active infinitive and indicative verb in the MorphGNT SBLGNT matches with one of the patterns we&#39;ve discussed GIVEN ITS MORPHOSYNTACTIC PROPERTY SET. We want to test, for example, whether every verb tagged as &lt;code&gt;-PAN----&lt;/code&gt; matches one of Xειν, Xεῖν, Xοῦν, Xᾶν, Xῆν, Xύναι, Xέναι, Xόναι, Xάναι, or εἶναι. Or whether every verb tagged as &lt;code&gt;2PAI-S--&lt;/code&gt; matches one of Xεις, Xεῖς, Xοῖς, Xᾷς, Xῇς, Xυς, Xης, Xως, Xης, or εἶ.&lt;/p&gt;
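&lt;p&gt;The test just described can be sketched as follows. This is an illustration, not the script itself: the &lt;code&gt;PATTERNS&lt;/code&gt; table only covers the two tags mentioned above, the function names are hypothetical, and the forms at the end are merely examples. Matching is accent-sensitive and is done on decomposed (NFD) text so that differently encoded accents compare equal.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;import unicodedata

# Patterns as written above: a leading X stands for the rest of the form;
# a pattern without X must match the whole form.
PATTERNS = {
    "-PAN----": ["Xειν", "Xεῖν", "Xοῦν", "Xᾶν", "Xῆν",
                 "Xύναι", "Xέναι", "Xόναι", "Xάναι", "εἶναι"],
    "2PAI-S--": ["Xεις", "Xεῖς", "Xοῖς", "Xᾷς", "Xῇς",
                 "Xυς", "Xης", "Xως", "εἶ"],
}


def nfd(text):
    """Decompose so precomposed and combining spellings compare equal."""
    return unicodedata.normalize("NFD", text)


def matches(tag, form):
    """True if form fits at least one pattern listed for its tag."""
    form = nfd(form)
    for pattern in PATTERNS[tag]:
        if pattern.startswith("X"):
            if form.endswith(nfd(pattern[1:])):
                return True
        elif form == nfd(pattern):
            return True
    return False


for tag, form in [("-PAN----", "λέγειν"), ("-PAN----", "ζητεῖν"),
                  ("-PAN----", "παρεῖναι")]:
    print(tag, form, matches(tag, form))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The last form, a compound of the copula, is the kind of thing that fails to match any of the listed patterns.&lt;/p&gt;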
&lt;p&gt;Running a short Python script over the MorphGNT, it turns out there are 14 forms in 69 instances that do NOT match.&lt;/p&gt;
&lt;p&gt;Three of these forms are φημί. The issue here is that φημί is enclitic in the indicative and so, even though it otherwise follows a &lt;strong&gt;PA-9&lt;/strong&gt; paradigm, the accentuation doesn&#39;t match. If we want to capture the enclitic nature of φημί in its inflection class, we&#39;ll need to create a variant of &lt;strong&gt;PA-9&lt;/strong&gt; that is enclitic.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;PA-9&lt;/th&gt;
&lt;th&gt;PA-9-ENCLITIC&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INF&lt;/td&gt;
&lt;td&gt;Xάναι&lt;/td&gt;
&lt;td&gt;&lt;i&gt;Xάναι&lt;/i&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;Xημι&lt;/td&gt;
&lt;td&gt;Xημί&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;Xης&lt;/td&gt;
&lt;td&gt;&lt;i&gt;Xής&lt;/i&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;Xησι(ν)&lt;/td&gt;
&lt;td&gt;Xησί(ν)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;Xαμεν&lt;/td&gt;
&lt;td&gt;&lt;i&gt;Xαμέν&lt;/i&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;Xατε&lt;/td&gt;
&lt;td&gt;&lt;i&gt;Xατέ&lt;/i&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;Xᾶσι(ν)&lt;/td&gt;
&lt;td&gt;Xασί(ν)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The &lt;strong&gt;2SG&lt;/strong&gt; appears more frequently as φῄς in Classical Greek but neither form appears in the SBLGNT so we&#39;ll put that issue aside for now.&lt;/p&gt;
&lt;p&gt;Another eight of these forms are compounds of the copula and so have different accentuation and breathing (but are otherwise identical to &lt;strong&gt;PA-10&lt;/strong&gt;).&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;PA-10&lt;/th&gt;
&lt;th&gt;PA-10-COMPOUND&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INF&lt;/td&gt;
&lt;td&gt;εἶναι&lt;/td&gt;
&lt;td&gt;Xεῖναι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;εἰμί&lt;/td&gt;
&lt;td&gt;Xειμι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;εἶ&lt;/td&gt;
&lt;td&gt;Xει&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;ἐστί(ν)&lt;/td&gt;
&lt;td&gt;Xεστι(ν)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;ἐσμέν&lt;/td&gt;
&lt;td&gt;Xεσμεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;ἐστέ&lt;/td&gt;
&lt;td&gt;Xεστε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;εἰσί(ν)&lt;/td&gt;
&lt;td&gt;Xεισι(ν)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The only additional variation here is εἰσίασιν in Hebrews 9.6 but this is not, in fact, derived from εἰς + εἰμί but rather εἰς + εἶμι. Let&#39;s create a new paradigm for εἶμι even though it doesn&#39;t appear in the SBLGNT just so we can derive a paradigm for the compound case from it.&lt;/p&gt;
&lt;p&gt;Here &lt;strong&gt;PA-11&lt;/strong&gt; and &lt;strong&gt;PA-11-COMPOUND&lt;/strong&gt; are shown alongside &lt;strong&gt;PA-10&lt;/strong&gt; for comparison (note the italic forms don&#39;t appear in the SBLGNT):&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;PA-10&lt;/th&gt;
&lt;th&gt;PA-11&lt;/th&gt;
&lt;th&gt;PA-11-COMPOUND&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INF&lt;/td&gt;
&lt;td&gt;εἶναι&lt;/td&gt;
&lt;td&gt;&lt;i&gt;ἰέναι&lt;/i&gt;&lt;/td&gt;
&lt;td&gt;Xιέναι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;εἰμί&lt;/td&gt;
&lt;td&gt;&lt;i&gt;εἶμι&lt;/i&gt;&lt;/td&gt;
&lt;td&gt;&lt;i&gt;Xειμι&lt;/i&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;εἶ&lt;/td&gt;
&lt;td&gt;&lt;i&gt;εἶ&lt;/i&gt;&lt;/td&gt;
&lt;td&gt;&lt;i&gt;Xει&lt;/i&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;ἐστί(ν)&lt;/td&gt;
&lt;td&gt;&lt;i&gt;εἶσι(ν)&lt;/i&gt;&lt;/td&gt;
&lt;td&gt;&lt;i&gt;Xεισι(ν)&lt;/i&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;ἐσμέν&lt;/td&gt;
&lt;td&gt;&lt;i&gt;ἴμεν&lt;/i&gt;&lt;/td&gt;
&lt;td&gt;&lt;i&gt;Xιμεν&lt;/i&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;ἐστέ&lt;/td&gt;
&lt;td&gt;&lt;i&gt;ἴτε&lt;/i&gt;&lt;/td&gt;
&lt;td&gt;&lt;i&gt;Xιτε&lt;/i&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;εἰσί(ν)&lt;/td&gt;
&lt;td&gt;&lt;i&gt;ἴασι(ν)&lt;/i&gt;&lt;/td&gt;
&lt;td&gt;Xίασι(ν)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;PA-11&lt;/strong&gt; and &lt;strong&gt;PA-11-COMPOUND&lt;/strong&gt; are very similar to &lt;strong&gt;PA-6a&lt;/strong&gt; through &lt;strong&gt;PA-9&lt;/strong&gt; except with ει/ι instead of υ/υ, η/ε, ω/ο, η/α. The &lt;strong&gt;INF&lt;/strong&gt; being ιε is a little unexpected but outside the scope of the current discussion as we really are just wanting to capture the &lt;strong&gt;3PL&lt;/strong&gt; of &lt;strong&gt;PA-11-COMPOUND&lt;/strong&gt; for now.&lt;/p&gt;
&lt;p&gt;Note that εἰσιέναι in Acts 3.3 is also from εἰς + εἶμι but this slipped us by because we have a Xέναι pattern already. Similarly, we have ἐξιέναι in Acts 20.7 and 27.43. With the addition of &lt;strong&gt;PA-11-COMPOUND&lt;/strong&gt; we now have a slight ambiguity with &lt;strong&gt;PA-7&lt;/strong&gt; (in the &lt;strong&gt;INF&lt;/strong&gt;) and &lt;strong&gt;PA-10-COMPOUND&lt;/strong&gt; (in the &lt;strong&gt;1SG&lt;/strong&gt; and &lt;strong&gt;2SG&lt;/strong&gt;). This isn&#39;t a problem at the moment but will come up again (as will other ambiguities) in the next post.&lt;/p&gt;
&lt;p&gt;Adding these paradigm variants covers 12 of our originally non-matching forms. The remaining two are the impersonal χρή and ἔνι which represent fossilized phrases with the copula elided. For our stats we&#39;ll ignore them.&lt;/p&gt;
&lt;p&gt;In the next post, we&#39;ll see if we can categorize the lexemes in the SBLGNT into inflection classes based on these paradigms and therefore be able to study how frequent they are from both a type and token perspective.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part fifteen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">More Vocabulary Statistics</title>
    <link href="https://jktauber.com/2017/09/02/more-vocabulary-statistics/" rel="alternate" type="text/html" title="More Vocabulary Statistics"/>
    <published>2017-09-02</published>
    <updated>2017-09-02</updated>
    <id>https://jktauber.com/2017/09/02/more-vocabulary-statistics</id>
    <content type="html" xml:base="https://jktauber.com/2017/09/02/more-vocabulary-statistics/">&lt;p&gt;With a boost in numbers on &lt;a href=&#34;http://vocab.oxlos.org&#34;&gt;vocab.oxlos.org&lt;/a&gt;, this post looks at some slightly more detailed statistics from the first activity.&lt;/p&gt;
&lt;p&gt;Just 5 days ago there were &lt;strong&gt;82&lt;/strong&gt; signups with &lt;strong&gt;52&lt;/strong&gt; people having completed the first activity. Now there have been a total of &lt;strong&gt;116&lt;/strong&gt; signups and &lt;strong&gt;79&lt;/strong&gt; people have done at least the first activity (with &lt;strong&gt;44&lt;/strong&gt; having done more than one). Thank you very much everyone!&lt;/p&gt;
&lt;p&gt;In my &lt;a href=&#34;/2017/08/29/some-initial-vocabulary-statistics/&#34;&gt;last post&lt;/a&gt; we looked at mean item difficulty (what proportion of people get an item correct) by frequency bucket.&lt;/p&gt;
&lt;p&gt;We saw that the coarse frequency buckets had an okay correlation with item difficulty but not great. We’ll explore that a little more in the near future but in this post I want to introduce another dimension: the ability of the person being asked the item.&lt;/p&gt;
&lt;p&gt;I should note that in psychometrics (and in item response theory in particular, which we’ll be getting to) the term &#34;ability&#34; is used in a specific sense of the measurement we’re trying to take of the person (with no assumption of whether it’s innate or even desirable). It’s just the person-specific construct we’re trying to measure.&lt;/p&gt;
&lt;p&gt;As an initial proxy for this &#34;ability&#34; in the context of the first activity on the site, I’ve used the total percentage of items in that activity answered correctly by a given person. This is just the raw percentage of items answered correctly, not quite the same as the estimate of NT vocabulary coverage shown on the site. This raw percentage is then used to group people into buckets (just in the context of the first activity for now).&lt;/p&gt;
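&lt;p&gt;As a rough illustration only (this is not the code behind the site), the bucketing and tabulation could be sketched in Python as follows. The function names, the tuple layout of &lt;code&gt;responses&lt;/code&gt;, and the equal-width ability buckets are all assumptions made for the example.&lt;/p&gt;

```python
from collections import defaultdict


def ability_bucket(fraction_correct, n_buckets=5):
    """Map a raw fraction correct (0.0 to 1.0) to a bucket index 0..n_buckets-1."""
    # A perfect score is folded into the top bucket.
    return min(int(fraction_correct * n_buckets), n_buckets - 1)


def tabulate(responses, n_buckets=5):
    """responses: list of (person, item_frequency_bucket, correct) tuples.

    Returns {(frequency_bucket, ability_bucket): mean proportion correct}.
    """
    # The raw fraction correct per person is the proxy for ability.
    totals = defaultdict(lambda: [0, 0])  # person: [correct, attempted]
    for person, _, correct in responses:
        totals[person][0] += int(correct)
        totals[person][1] += 1
    ability = {
        person: ability_bucket(c / n, n_buckets) for person, (c, n) in totals.items()
    }

    # Mean proportion correct in each (frequency bucket, ability bucket) cell.
    cells = defaultdict(lambda: [0, 0])
    for person, freq_bucket, correct in responses:
        cell = cells[(freq_bucket, ability[person])]
        cell[0] += int(correct)
        cell[1] += 1
    return {key: c / n for key, (c, n) in cells.items()}
```

&lt;p&gt;Passing &lt;code&gt;n_buckets=10&lt;/code&gt; would give a ten-bucket version of the same table.&lt;/p&gt;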
&lt;p&gt;Now we can tabulate item frequency buckets vs person ability buckets with the following result:&lt;/p&gt;
&lt;div align=&#34;center&#34;&gt;
&lt;img src=&#34;/images/5_buckets.png&#34; width=&#34;100%&#34;&gt;
&lt;/div&gt;

&lt;p&gt;First off, you can see we’re still somewhat lacking in numbers of people of beginning-intermediate ability.&lt;/p&gt;
&lt;p&gt;But importantly, you can see how mean item difficulty (the number in each cell) varies by ability bucket (the column). We’ve already seen that mean item difficulty isn’t a great predictor of item frequency bucket. Splitting out different abilities like we do above makes discrimination easier in some cases. But the important thing to note in the table above is that the mean item difficulty WITHIN a frequency bucket (row) is a good indicator of a person’s overall ability bucket.&lt;/p&gt;
&lt;p&gt;This is less the case in the bucket for the most frequent items (the row labeled &lt;strong&gt;1&lt;/strong&gt;), which makes ability buckets 20% and above difficult to discriminate. Similarly, the less frequent item buckets aren’t as good at discriminating between the lower ability buckets. This is what we would expect.&lt;/p&gt;
&lt;p&gt;But overall, frequency buckets &lt;strong&gt;2&lt;/strong&gt; through &lt;strong&gt;5&lt;/strong&gt; (and especially &lt;strong&gt;3&lt;/strong&gt; and &lt;strong&gt;4&lt;/strong&gt;) do an excellent job of discriminating each of the ability buckets above 20%. &lt;strong&gt;5&lt;/strong&gt; seems particularly well suited for each of the buckets at 40% ability and above and &lt;strong&gt;1&lt;/strong&gt; only really between the 0–20% bucket and the rest.&lt;/p&gt;
&lt;p&gt;I suspect it’s going to be interesting to have more fine-grained item frequencies but even MORE interesting to put aside frequency altogether and bucket them by overall difficulty. I’ll do that in a subsequent post once I’ve done the analysis. At some point I’ll also look at individual items and their ability to discriminate ability.&lt;/p&gt;
&lt;p&gt;For now, though, I did want to share a finer-grained bucketing of ability, with ten buckets instead of five:&lt;/p&gt;
&lt;div align=&#34;center&#34;&gt;
&lt;img src=&#34;/images/10_buckets.png&#34; width=&#34;100%&#34;&gt;
&lt;/div&gt;

&lt;p&gt;The lack of people below the 50% ability mark makes this a little less useful and there are adjacent ability buckets that cease to be discriminating at this level of granularity.&lt;/p&gt;
&lt;p&gt;But the important pattern is still there, assuming for now frequency is a proxy for difficulty: if an item is easy, it can’t discriminate people of higher ability, although it may be great at discriminating those of lower ability; and if an item is hard, it can’t discriminate people of lower ability, although it may be great at discriminating those of higher ability.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">With a boost in numbers on &lt;a href=&#34;http://vocab.oxlos.org&#34;&gt;vocab.oxlos.org&lt;/a&gt;, this post looks at some slightly more detailed statistics from the first activity.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 14</title>
    <link href="https://jktauber.com/2017/08/29/tour-greek-morphology-part-14/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 14"/>
    <published>2017-08-29</published>
    <updated>2017-08-29</updated>
    <id>https://jktauber.com/2017/08/29/tour-greek-morphology-part-14</id>
    <content type="html" xml:base="https://jktauber.com/2017/08/29/tour-greek-morphology-part-14/">&lt;p&gt;Part fourteen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;Now we summarize our middle distinguishers. As we did for &lt;strong&gt;PA-6a&lt;/strong&gt;, we&#39;ll include the upsilon for &lt;strong&gt;PM-6a&lt;/strong&gt;.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;PM-1&lt;/th&gt;
&lt;th&gt;PM-2&lt;/th&gt;
&lt;th&gt;PM-3&lt;/th&gt;
&lt;th&gt;PM-4&lt;/th&gt;
&lt;th&gt;PM-5&lt;/th&gt;
&lt;th&gt;PM-6a&lt;/th&gt;
&lt;th&gt;PM-7&lt;/th&gt;
&lt;th&gt;PM-8&lt;/th&gt;
&lt;th&gt;PM-9&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INF&lt;/td&gt;
&lt;td&gt;Xεσθαι&lt;/td&gt;
&lt;td&gt;Xεῖσθαι&lt;/td&gt;
&lt;td&gt;Xοῦσθαι&lt;/td&gt;
&lt;td&gt;Xᾶσθαι&lt;/td&gt;
&lt;td&gt;Xῆσθαι&lt;/td&gt;
&lt;td&gt;Xυσθαι&lt;/td&gt;
&lt;td&gt;Xεσθαι&lt;/td&gt;
&lt;td&gt;Xοσθαι&lt;/td&gt;
&lt;td&gt;Xασθαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;Xομαι&lt;/td&gt;
&lt;td&gt;Xοῦμαι&lt;/td&gt;
&lt;td&gt;Xοῦμαι&lt;/td&gt;
&lt;td&gt;Xῶμαι&lt;/td&gt;
&lt;td&gt;Xῶμαι&lt;/td&gt;
&lt;td&gt;Xυμαι&lt;/td&gt;
&lt;td&gt;Xεμαι&lt;/td&gt;
&lt;td&gt;Xομαι&lt;/td&gt;
&lt;td&gt;Xαμαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;Xῃ or Xει&lt;/td&gt;
&lt;td&gt;Xῇ or Xεῖ&lt;/td&gt;
&lt;td&gt;Xοῖ&lt;/td&gt;
&lt;td&gt;Xᾷ&lt;/td&gt;
&lt;td&gt;Xῇ&lt;/td&gt;
&lt;td&gt;Xυσαι&lt;/td&gt;
&lt;td&gt;Xεσαι&lt;/td&gt;
&lt;td&gt;Xοσαι&lt;/td&gt;
&lt;td&gt;Xασαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;Xεται&lt;/td&gt;
&lt;td&gt;Xεῖται&lt;/td&gt;
&lt;td&gt;Xοῦται&lt;/td&gt;
&lt;td&gt;Xᾶται&lt;/td&gt;
&lt;td&gt;Xῆται&lt;/td&gt;
&lt;td&gt;Xυται&lt;/td&gt;
&lt;td&gt;Xεται&lt;/td&gt;
&lt;td&gt;Xοται&lt;/td&gt;
&lt;td&gt;Xαται&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;Xόμεθα&lt;/td&gt;
&lt;td&gt;Xούμεθα&lt;/td&gt;
&lt;td&gt;Xούμεθα&lt;/td&gt;
&lt;td&gt;Xώμεθα&lt;/td&gt;
&lt;td&gt;Xώμεθα&lt;/td&gt;
&lt;td&gt;Xύμεθα&lt;/td&gt;
&lt;td&gt;Xέμεθα&lt;/td&gt;
&lt;td&gt;Xόμεθα&lt;/td&gt;
&lt;td&gt;Xάμεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;Xεσθε&lt;/td&gt;
&lt;td&gt;Xεῖσθε&lt;/td&gt;
&lt;td&gt;Xοῦσθε&lt;/td&gt;
&lt;td&gt;Xᾶσθε&lt;/td&gt;
&lt;td&gt;Xῆσθε&lt;/td&gt;
&lt;td&gt;Xυσθε&lt;/td&gt;
&lt;td&gt;Xεσθε&lt;/td&gt;
&lt;td&gt;Xοσθε&lt;/td&gt;
&lt;td&gt;Xασθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;Xονται&lt;/td&gt;
&lt;td&gt;Xοῦνται&lt;/td&gt;
&lt;td&gt;Xοῦνται&lt;/td&gt;
&lt;td&gt;Xῶνται&lt;/td&gt;
&lt;td&gt;Xῶνται&lt;/td&gt;
&lt;td&gt;Xυνται&lt;/td&gt;
&lt;td&gt;Xενται&lt;/td&gt;
&lt;td&gt;Xονται&lt;/td&gt;
&lt;td&gt;Xανται&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;and if we capture the common elements in each row:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;PM-1&lt;/th&gt;
&lt;th&gt;PM-2&lt;/th&gt;
&lt;th&gt;PM-3&lt;/th&gt;
&lt;th&gt;PM-4&lt;/th&gt;
&lt;th&gt;PM-5&lt;/th&gt;
&lt;th&gt;PM-6a&lt;/th&gt;
&lt;th&gt;PM-7&lt;/th&gt;
&lt;th&gt;PM-8&lt;/th&gt;
&lt;th&gt;PM-9&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INF&lt;/td&gt;
&lt;td&gt;-σθαι&lt;/td&gt;
&lt;td&gt;-σθαι&lt;/td&gt;
&lt;td&gt;-σθαι&lt;/td&gt;
&lt;td&gt;-σθαι&lt;/td&gt;
&lt;td&gt;-σθαι&lt;/td&gt;
&lt;td&gt;-σθαι&lt;/td&gt;
&lt;td&gt;-σθαι&lt;/td&gt;
&lt;td&gt;-σθαι&lt;/td&gt;
&lt;td&gt;-σθαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;-μαι&lt;/td&gt;
&lt;td&gt;-μαι&lt;/td&gt;
&lt;td&gt;-μαι&lt;/td&gt;
&lt;td&gt;-μαι&lt;/td&gt;
&lt;td&gt;-μαι&lt;/td&gt;
&lt;td&gt;-μαι&lt;/td&gt;
&lt;td&gt;-μαι&lt;/td&gt;
&lt;td&gt;-μαι&lt;/td&gt;
&lt;td&gt;-μαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;-{ι}&lt;/td&gt;
&lt;td&gt;-{ι}&lt;/td&gt;
&lt;td&gt;-{ι}&lt;/td&gt;
&lt;td&gt;-{ι}&lt;/td&gt;
&lt;td&gt;-{ι}&lt;/td&gt;
&lt;td&gt;-σαι&lt;/td&gt;
&lt;td&gt;-σαι&lt;/td&gt;
&lt;td&gt;-σαι&lt;/td&gt;
&lt;td&gt;-σαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;-ται&lt;/td&gt;
&lt;td&gt;-ται&lt;/td&gt;
&lt;td&gt;-ται&lt;/td&gt;
&lt;td&gt;-ται&lt;/td&gt;
&lt;td&gt;-ται&lt;/td&gt;
&lt;td&gt;-ται&lt;/td&gt;
&lt;td&gt;-ται&lt;/td&gt;
&lt;td&gt;-ται&lt;/td&gt;
&lt;td&gt;-ται&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;-νται&lt;/td&gt;
&lt;td&gt;-νται&lt;/td&gt;
&lt;td&gt;-νται&lt;/td&gt;
&lt;td&gt;-νται&lt;/td&gt;
&lt;td&gt;-νται&lt;/td&gt;
&lt;td&gt;-νται&lt;/td&gt;
&lt;td&gt;-νται&lt;/td&gt;
&lt;td&gt;-νται&lt;/td&gt;
&lt;td&gt;-νται&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Notice that, other than the contraction happening in &lt;strong&gt;2SG&lt;/strong&gt; obscuring the historical σαι, and unlike the active, there is no difference between the thematic and athematic endings.&lt;/p&gt;
&lt;p&gt;That does mean, however, that the &lt;strong&gt;INF&lt;/strong&gt; is no longer completely predictive of the other forms and, in fact, no cell is (&lt;strong&gt;2SG&lt;/strong&gt; getting close but failing because of the -ῇ ambiguity).&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;INF&lt;/strong&gt;, &lt;strong&gt;3SG&lt;/strong&gt;, and &lt;strong&gt;2PL&lt;/strong&gt; can&#39;t distinguish within the set {&lt;strong&gt;PM-1&lt;/strong&gt;, &lt;strong&gt;PM-7&lt;/strong&gt;}&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;1SG&lt;/strong&gt;, &lt;strong&gt;1PL&lt;/strong&gt;, and &lt;strong&gt;3PL&lt;/strong&gt; can&#39;t distinguish within the set {&lt;strong&gt;PM-1&lt;/strong&gt;, &lt;strong&gt;PM-8&lt;/strong&gt;}, the set {&lt;strong&gt;PM-2&lt;/strong&gt;, &lt;strong&gt;PM-3&lt;/strong&gt;}, or the set {&lt;strong&gt;PM-4&lt;/strong&gt;, &lt;strong&gt;PM-5&lt;/strong&gt;}&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;2SG&lt;/strong&gt; (at least if ῇ) can&#39;t distinguish within the set {&lt;strong&gt;PM-2&lt;/strong&gt;, &lt;strong&gt;PM-5&lt;/strong&gt;}&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That means, even if you had the &lt;strong&gt;INF&lt;/strong&gt;, &lt;strong&gt;3SG&lt;/strong&gt;, AND &lt;strong&gt;2PL&lt;/strong&gt; of a word, you might not be able to predict its other forms (but if you had a single one of those other forms, all the rest would be predictable). And if you had the &lt;strong&gt;1SG&lt;/strong&gt;, &lt;strong&gt;1PL&lt;/strong&gt;, and/or &lt;strong&gt;3PL&lt;/strong&gt; of a word, you might not be able to predict its other forms (but again, if you had a single one of those other forms, all the rest would be predictable).&lt;/p&gt;
&lt;p&gt;This mirrors the ambiguous categories we&#39;ve already seen.&lt;/p&gt;
&lt;table&gt;
  &lt;tr&gt;
    &lt;td&gt;&lt;b&gt;PM-&lt;/b&gt;{&lt;b&gt;1&lt;/b&gt;, &lt;b&gt;7&lt;/b&gt;}&lt;/td&gt;
    &lt;td&gt;ε in &lt;b&gt;INF&lt;/b&gt;, &lt;b&gt;3SG&lt;/b&gt;, and &lt;b&gt;2PL&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&lt;b&gt;PM-&lt;/b&gt;{&lt;b&gt;1&lt;/b&gt;, &lt;b&gt;8&lt;/b&gt;}&lt;/td&gt;
    &lt;td&gt;ο in &lt;b&gt;1SG&lt;/b&gt;, &lt;b&gt;1PL&lt;/b&gt;, and &lt;b&gt;3PL&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&lt;b&gt;PM-&lt;/b&gt;{&lt;b&gt;2&lt;/b&gt;, &lt;b&gt;3&lt;/b&gt;}&lt;/td&gt;
    &lt;td&gt;οῦ in &lt;b&gt;1SG&lt;/b&gt;, &lt;b&gt;1PL&lt;/b&gt;, and &lt;b&gt;3PL&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&lt;b&gt;PM-&lt;/b&gt;{&lt;b&gt;4&lt;/b&gt;, &lt;b&gt;5&lt;/b&gt;}&lt;/td&gt;
    &lt;td&gt;ῶ in &lt;b&gt;1SG&lt;/b&gt;, &lt;b&gt;1PL&lt;/b&gt;, and &lt;b&gt;3PL&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;Plus:&lt;/p&gt;
&lt;table&gt;
  &lt;tr&gt;
    &lt;td&gt;&lt;b&gt;PM-&lt;/b&gt;{&lt;b&gt;2&lt;/b&gt;, &lt;b&gt;5&lt;/b&gt;}&lt;/td&gt;
    &lt;td&gt;ῇ ending in &lt;b&gt;2SG&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;Also, without accentuation, &lt;strong&gt;PM-4&lt;/strong&gt; and &lt;strong&gt;PM-9&lt;/strong&gt; would be indistinguishable in &lt;strong&gt;INF&lt;/strong&gt;, &lt;strong&gt;3SG&lt;/strong&gt;, and &lt;strong&gt;2PL&lt;/strong&gt;. And, similarly, &lt;strong&gt;PM-1&lt;/strong&gt; and &lt;strong&gt;PM-2&lt;/strong&gt; in &lt;strong&gt;2SG&lt;/strong&gt;.&lt;/p&gt;
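&lt;p&gt;The ambiguity sets above can also be derived mechanically from the first table. The following Python is a minimal sketch under a few assumptions: only the first &lt;strong&gt;2SG&lt;/strong&gt; variant (ῃ, ῇ) is used for &lt;strong&gt;PM-1&lt;/strong&gt; and &lt;strong&gt;PM-2&lt;/strong&gt;, the leading X is dropped, and &#34;accentuation&#34; is taken to mean acute, grave and circumflex only. The names &lt;code&gt;ROWS&lt;/code&gt;, &lt;code&gt;strip_accents&lt;/code&gt; and &lt;code&gt;ambiguous_sets&lt;/code&gt; are made up for the example.&lt;/p&gt;

```python
import unicodedata
from collections import defaultdict

PARADIGMS = ["PM-1", "PM-2", "PM-3", "PM-4", "PM-5", "PM-6a", "PM-7", "PM-8", "PM-9"]

# Distinguishers from the first table, in PARADIGMS order, without the X.
ROWS = {
    "INF": "εσθαι εῖσθαι οῦσθαι ᾶσθαι ῆσθαι υσθαι εσθαι οσθαι ασθαι",
    "1SG": "ομαι οῦμαι οῦμαι ῶμαι ῶμαι υμαι εμαι ομαι αμαι",
    "2SG": "ῃ ῇ οῖ ᾷ ῇ υσαι εσαι οσαι ασαι",
    "3SG": "εται εῖται οῦται ᾶται ῆται υται εται οται αται",
    "1PL": "όμεθα ούμεθα ούμεθα ώμεθα ώμεθα ύμεθα έμεθα όμεθα άμεθα",
    "2PL": "εσθε εῖσθε οῦσθε ᾶσθε ῆσθε υσθε εσθε οσθε ασθε",
    "3PL": "ονται οῦνται οῦνται ῶνται ῶνται υνται ενται ονται ανται",
}

# Combining grave, acute and Greek perispomeni (circumflex).
ACCENTS = {"\u0300", "\u0301", "\u0342"}


def strip_accents(text):
    """Remove accents but keep other marks such as the iota subscript."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in ACCENTS)
    return unicodedata.normalize("NFC", kept)


def keep_accents(text):
    """Only normalize, so that equivalent encodings compare equal."""
    return unicodedata.normalize("NFC", text)


def ambiguous_sets(rows, normalize=keep_accents):
    """For each cell, list the groups of paradigms that share a distinguisher."""
    result = {}
    for cell, forms in rows.items():
        groups = defaultdict(list)
        for paradigm, form in zip(PARADIGMS, forms.split()):
            groups[normalize(form)].append(paradigm)
        result[cell] = [group for group in groups.values() if len(group) > 1]
    return result
```

&lt;p&gt;Calling &lt;code&gt;ambiguous_sets(ROWS)&lt;/code&gt; groups the paradigms with accents kept, and &lt;code&gt;ambiguous_sets(ROWS, normalize=strip_accents)&lt;/code&gt; does the same with accents ignored.&lt;/p&gt;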
&lt;p&gt;In the next part, we&#39;ll look at the MorphGNT to see whether the distinguishers here and in &lt;a href=&#34;/2017/08/26/tour-greek-morphology-part-13/&#34;&gt;part 13&lt;/a&gt; fully cover all present infinitive and indicative verbs in the SBLGNT. We&#39;ll also look at some frequency data. How (relatively) common is each of the paradigms we&#39;ve identified? Which seem to be productive and which not? We&#39;ll also briefly touch on words that change inflectional class (and hence paradigm) and what role ambiguous forms might play in this.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part fourteen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">Some Initial Vocabulary Statistics</title>
    <link href="https://jktauber.com/2017/08/29/some-initial-vocabulary-statistics/" rel="alternate" type="text/html" title="Some Initial Vocabulary Statistics"/>
    <published>2017-08-29</published>
    <updated>2017-08-29</updated>
    <id>https://jktauber.com/2017/08/29/some-initial-vocabulary-statistics</id>
    <content type="html" xml:base="https://jktauber.com/2017/08/29/some-initial-vocabulary-statistics/">&lt;p&gt;Here are some very preliminary statistics from the Greek Vocab site’s first month.&lt;/p&gt;
&lt;p&gt;So far &lt;strong&gt;82&lt;/strong&gt; people have signed up to &lt;a href=&#34;http://vocab.oxlos.org/&#34;&gt;http://vocab.oxlos.org/&lt;/a&gt; and &lt;strong&gt;52&lt;/strong&gt; have completed at least the first activity, a common noun receptive vocabulary leveling test based on a test form developed (for English) by Paul Nation.&lt;/p&gt;
&lt;p&gt;Recall from my &lt;a href=&#34;/2017/07/29/new-site-vocabulary-experiments/&#34;&gt;initial post&lt;/a&gt; on the site, that vocabulary items in that activity are classified into one of five buckets based on how many times they occur in the Greek New Testament.&lt;/p&gt;
&lt;p&gt;Here are the mean results (with standard error) for each bucket for the first activity (N=52):&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;th&gt;bucket&lt;/th&gt;&lt;th&gt;occurrences&lt;/th&gt;&lt;th&gt;mean ± std err&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;32 or more times&lt;/td&gt;&lt;td&gt;0.966 ± 0.008&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;16 to 31 times&lt;/td&gt;&lt;td&gt;0.837 ± 0.028&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;3&lt;/td&gt;&lt;td&gt;4 to 15 times&lt;/td&gt;&lt;td&gt;0.667 ± 0.041&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;2 or 3 times&lt;/td&gt;&lt;td&gt;0.556 ± 0.049&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;5&lt;/td&gt;&lt;td&gt;1 time&lt;/td&gt;&lt;td&gt;0.582 ± 0.047&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;The first four buckets get increasingly more difficult, as one would expect. But notice that buckets 4 and 5 are indistinguishable within the standard error of the two means.&lt;/p&gt;
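&lt;p&gt;One way to make &#34;indistinguishable within the standard error&#34; concrete is to ask whether the mean ± 1 standard error ranges of two buckets overlap. That reading is an assumption for this sketch and is not a formal significance test. The numbers are copied from the table above and &lt;code&gt;separated&lt;/code&gt; is a made-up helper name.&lt;/p&gt;

```python
# (mean, standard error) per bucket for the first activity (N=52).
RESULTS = {
    1: (0.966, 0.008),
    2: (0.837, 0.028),
    3: (0.667, 0.041),
    4: (0.556, 0.049),
    5: (0.582, 0.047),
}


def separated(a, b):
    """True if the mean ± 1 standard error ranges of a and b do not overlap."""
    (mean_a, se_a), (mean_b, se_b) = a, b
    return abs(mean_a - mean_b) > se_a + se_b


# Compare each bucket with the next one down in frequency.
for bucket in range(1, 5):
    verdict = separated(RESULTS[bucket], RESULTS[bucket + 1])
    print(f"buckets {bucket} and {bucket + 1} separated: {verdict}")
```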
&lt;p&gt;Here are the results of the next three activities of the same type.&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;th&gt;bucket&lt;/th&gt;&lt;th&gt;GNT Nouns 2&lt;/th&gt;&lt;th&gt;GNT Nouns 3&lt;/th&gt;&lt;th&gt;GNT Nouns 4&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&amp;nbsp;&lt;/td&gt;&lt;td&gt;N=30&lt;/td&gt;&lt;td&gt;N=19&lt;/td&gt;&lt;td&gt;N=15&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;0.985 ± 0.004&lt;/td&gt;&lt;td&gt;0.991 ± 0.005&lt;/td&gt;&lt;td&gt;0.985 ± 0.007&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;0.894 ± 0.020&lt;/td&gt;&lt;td&gt;0.901 ± 0.021&lt;/td&gt;&lt;td&gt;0.930 ± 0.018&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;3&lt;/td&gt;&lt;td&gt;0.631 ± 0.046&lt;/td&gt;&lt;td&gt;0.661 ± 0.039&lt;/td&gt;&lt;td&gt;0.689 ± 0.051&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;0.602 ± 0.060&lt;/td&gt;&lt;td&gt;0.570 ± 0.067&lt;/td&gt;&lt;td&gt;0.574 ± 0.059&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;5&lt;/td&gt;&lt;td&gt;0.450 ± 0.048&lt;/td&gt;&lt;td&gt;0.556 ± 0.064&lt;/td&gt;&lt;td&gt;0.611 ± 0.050&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;GNT Nouns 2&lt;/strong&gt; actually does successfully separate buckets 4 and 5 (apparently the hapax legomena in that test were harder) but it doesn’t do a great job distinguishing buckets 3 and 4. &lt;strong&gt;GNT Nouns 3&lt;/strong&gt; fails to distinguish buckets 4 and 5 and falls just short of separating 3 and 4 (the two standard-error ranges overlap slightly). &lt;strong&gt;GNT Nouns 4&lt;/strong&gt; likewise doesn’t really distinguish buckets 4 and 5 and only barely separates 3 and 4.&lt;/p&gt;
&lt;p&gt;It should be noted that the ability level of the average person doing an activity increases with each activity. This isn’t clear from the data presented here but is from other data. This is likely because a person who has done reasonably well on one activity is more likely to continue to do more activities.&lt;/p&gt;
&lt;p&gt;I COULD mitigate this problem by only including results for earlier activities from people who have completed all four. But before I do that, I’d actually like to just see more people do all four activities.&lt;/p&gt;
&lt;p&gt;Furthermore, the vast majority of people doing these activities are scoring above 50% and, in fact, no one scoring below 40% has attempted activities beyond the first. &lt;strong&gt;I NEED MORE BEGINNER-INTERMEDIATE LEVEL PEOPLE&lt;/strong&gt; to do all four tests! They will better discriminate mid-to-hard difficulty items (more on that concept later).&lt;/p&gt;
&lt;p&gt;But preliminary indications are that I haven’t quite got the buckets right yet. Fortunately, I can re-run analyses with different bucketing even if the distribution of items chosen for the tests is based on the existing bucketing scheme.&lt;/p&gt;
&lt;p&gt;I’ll continue to blog more statistics over time. Some topics I’d like to explore include inter-test reliability, G-theory, ANOVA, and IRT modeling.&lt;/p&gt;
&lt;p&gt;Thank you to everyone who is contributing to this. Please spread the word!&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Here are some very preliminary statistics from the Greek Vocab site’s first month.</summary>
  </entry><entry>
    <title type="html">Greek Letter Frequencies</title>
    <link href="https://jktauber.com/2017/08/27/greek-letter-frequencies/" rel="alternate" type="text/html" title="Greek Letter Frequencies"/>
    <published>2017-08-27</published>
    <updated>2017-08-27</updated>
    <id>https://jktauber.com/2017/08/27/greek-letter-frequencies</id>
    <content type="html" xml:base="https://jktauber.com/2017/08/27/greek-letter-frequencies/">&lt;p&gt;I recently saw a nice visualisation of English letter bigram frequencies and decided to replicate it with Greek New Testament data.&lt;/p&gt;
&lt;p&gt;You can see the English original in &lt;a href=&#34;http://allthingslinguistic.com/post/164611717478/datarep-letter-and-next-letter-frequencies-in&#34;&gt;this post&lt;/a&gt; on All Things Linguistic. That&#39;s not where I originally saw it, though. I think I saw a link on Twitter to a Reddit post.&lt;/p&gt;
&lt;p&gt;I wrote a quick Python script to generate the same style of visualisation based on word types (not tokens) in the SBLGNT after stripping accents and folding to lowercase (but keeping the apostrophe used to mark elision). This is the result:&lt;/p&gt;
&lt;div align=&#34;center&#34;&gt;
&lt;img src=&#34;/images/greek-letter-frequencies.png&#34; width=&#34;100%&#34;&gt;
&lt;/div&gt;

&lt;p&gt;The intensity of red in the left column indicates the relative frequency of that letter overall. Each row then indicates (via ordering and the intensity of blue) the relative frequencies of what letter follows that red letter. The superscript then indicates the single most likely letter to follow that sequence of two letters. So it shows all unigram frequencies, all bigram frequencies, and the most common trigram for each bigram.&lt;/p&gt;
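&lt;p&gt;The counts behind a chart like this are simple to gather. The following is a minimal Python sketch, not the script used for the image. It assumes the input is already a collection of word types, strips every combining mark (breathings and iota subscripts as well as accents), and treats final ς as its own letter. All function names are made up for the example.&lt;/p&gt;

```python
import unicodedata
from collections import Counter


def fold(word):
    """Lowercase and strip diacritics, leaving any elision apostrophe in place."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def ngram_counts(word_types):
    """Count letter unigrams, bigrams and trigrams over a set of word types."""
    unigrams, bigrams, trigrams = Counter(), Counter(), Counter()
    for word in word_types:
        letters = fold(word)
        unigrams.update(letters)
        bigrams.update(letters[i:i + 2] for i in range(len(letters) - 1))
        trigrams.update(letters[i:i + 3] for i in range(len(letters) - 2))
    return unigrams, bigrams, trigrams


def most_likely_next(trigrams, bigram):
    """Return the letter that most often follows a two-letter sequence, if any."""
    followers = Counter({t[2]: n for t, n in trigrams.items() if t[:2] == bigram})
    return followers.most_common(1)[0][0] if followers else None
```

&lt;p&gt;The unigram counts correspond to the red column, the bigram counts to the blue cells in each row, and &lt;code&gt;most_likely_next&lt;/code&gt; to the superscripts.&lt;/p&gt;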
&lt;p&gt;I also used the same bigram and trigram data to generate pseudowords, much like the English original did. At the time, I only tweeted about this second part.&lt;/p&gt;
&lt;blockquote class=&#34;twitter-tweet&#34; data-lang=&#34;en&#34;&gt;&lt;p lang=&#34;und&#34; dir=&#34;ltr&#34;&gt;Trigram-based generation of Greek-like words seems promising: ὀκρός θρωτοί δελθομοῦς ἐδωσῖνα ἐπιδάς εὑόν εἰπῆς ἐνησόφος πόδου δόξηλθον μετέ&lt;/p&gt;&amp;mdash; James Tauber (@jtauber) &lt;a href=&#34;https://twitter.com/jtauber/status/894510737552486400&#34;&gt;August 7, 2017&lt;/a&gt;&lt;/blockquote&gt;
&lt;script async src=&#34;//platform.twitter.com/widgets.js&#34; charset=&#34;utf-8&#34;&gt;&lt;/script&gt;

&lt;p&gt;Patrick Burns asked me for the pseudoword generation code so I extracted it, cleaned it up a bit and posted it to a gist &lt;a href=&#34;https://gist.github.com/jtauber/71c6ab6a7bfaf42cffe64d74b69e7a2a&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
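&lt;p&gt;The gist has the actual code. Purely as an illustration of the general idea, a trigram-driven generator might look something like the sketch below. It is not taken from the gist: the start and end markers, the function names and the length cap are all assumptions, and it expects at least one training word.&lt;/p&gt;

```python
import random
from collections import Counter, defaultdict

START, END = "^", "$"  # hypothetical word-boundary markers


def train(words):
    """Map each pair of symbols to a Counter of the symbols that follow it."""
    model = defaultdict(Counter)
    for word in words:
        padded = START * 2 + word + END
        for i in range(len(padded) - 2):
            model[padded[i:i + 2]][padded[i + 2]] += 1
    return model


def generate(model, rng, max_length=20):
    """Sample one pseudoword, picking each letter in proportion to its trigram count."""
    context, letters = START * 2, []
    while len(letters) != max_length:
        followers = model[context]
        choice = rng.choices(list(followers), weights=list(followers.values()))[0]
        if choice == END:
            break
        letters.append(choice)
        # Slide the two-symbol window forward by one.
        context = context[1] + choice
    return "".join(letters)
```

&lt;p&gt;Passing a seeded &lt;code&gt;random.Random&lt;/code&gt; instance as &lt;code&gt;rng&lt;/code&gt; makes the output repeatable.&lt;/p&gt;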
&lt;p&gt;I never got around to posting my letter frequency visualisation, but Seumas Macdonald (not knowing I&#39;d already done the work) pointed me to the All Things Linguistic blog post and asked about the possibility of doing the same for Greek. It was enough of a nudge to get this blog post written.&lt;/p&gt;
&lt;p&gt;Thanks Seumas and Patrick!&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I recently saw a nice visualisation of English letter bigram frequencies and decided to replicate it with Greek New Testament data.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 13</title>
    <link href="https://jktauber.com/2017/08/26/tour-greek-morphology-part-13/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 13"/>
    <published>2017-08-26</published>
    <updated>2017-08-26</updated>
    <id>https://jktauber.com/2017/08/26/tour-greek-morphology-part-13</id>
    <content type="html" xml:base="https://jktauber.com/2017/08/26/tour-greek-morphology-part-13/">&lt;p&gt;Part thirteen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;Let&#39;s summarize all 10 active distinguisher paradigms we&#39;ve seen so far (this will probably only lay out properly if your browser is wide):&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;PA-1&lt;/th&gt;
&lt;th&gt;PA-2&lt;/th&gt;
&lt;th&gt;PA-3&lt;/th&gt;
&lt;th&gt;PA-4&lt;/th&gt;
&lt;th&gt;PA-5&lt;/th&gt;
&lt;th&gt;PA-6&lt;/th&gt;
&lt;th&gt;PA-7&lt;/th&gt;
&lt;th&gt;PA-8&lt;/th&gt;
&lt;th&gt;PA-9&lt;/th&gt;
&lt;th&gt;PA-10&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INF&lt;/td&gt;
&lt;td&gt;Xειν&lt;/td&gt;
&lt;td&gt;Xεῖν&lt;/td&gt;
&lt;td&gt;Xοῦν&lt;/td&gt;
&lt;td&gt;Xᾶν&lt;/td&gt;
&lt;td&gt;Xῆν&lt;/td&gt;
&lt;td&gt;Xναι&lt;/td&gt;
&lt;td&gt;Xέναι&lt;/td&gt;
&lt;td&gt;Xόναι&lt;/td&gt;
&lt;td&gt;Xάναι&lt;/td&gt;
&lt;td&gt;εἶναι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;Xω&lt;/td&gt;
&lt;td&gt;Xῶ&lt;/td&gt;
&lt;td&gt;Xῶ&lt;/td&gt;
&lt;td&gt;Xῶ&lt;/td&gt;
&lt;td&gt;Xῶ&lt;/td&gt;
&lt;td&gt;Xμι&lt;/td&gt;
&lt;td&gt;Xημι&lt;/td&gt;
&lt;td&gt;Xωμι&lt;/td&gt;
&lt;td&gt;Xημι&lt;/td&gt;
&lt;td&gt;εἰμί&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;Xεις&lt;/td&gt;
&lt;td&gt;Xεῖς&lt;/td&gt;
&lt;td&gt;Xοῖς&lt;/td&gt;
&lt;td&gt;Xᾷς&lt;/td&gt;
&lt;td&gt;Xῇς&lt;/td&gt;
&lt;td&gt;Xς&lt;/td&gt;
&lt;td&gt;Xης&lt;/td&gt;
&lt;td&gt;Xως&lt;/td&gt;
&lt;td&gt;Xης&lt;/td&gt;
&lt;td&gt;εἶ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;Xει&lt;/td&gt;
&lt;td&gt;Xεῖ&lt;/td&gt;
&lt;td&gt;Xοῖ&lt;/td&gt;
&lt;td&gt;Xᾷ&lt;/td&gt;
&lt;td&gt;Xῇ&lt;/td&gt;
&lt;td&gt;Xσι(ν)&lt;/td&gt;
&lt;td&gt;Xησι(ν)&lt;/td&gt;
&lt;td&gt;Xωσι(ν)&lt;/td&gt;
&lt;td&gt;Xησι(ν)&lt;/td&gt;
&lt;td&gt;ἐστί(ν)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;Xομεν&lt;/td&gt;
&lt;td&gt;Xοῦμεν&lt;/td&gt;
&lt;td&gt;Xοῦμεν&lt;/td&gt;
&lt;td&gt;Xῶμεν&lt;/td&gt;
&lt;td&gt;Xῶμεν&lt;/td&gt;
&lt;td&gt;Xμεν&lt;/td&gt;
&lt;td&gt;Xεμεν&lt;/td&gt;
&lt;td&gt;Xομεν&lt;/td&gt;
&lt;td&gt;Xαμεν&lt;/td&gt;
&lt;td&gt;ἐσμέν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;Xετε&lt;/td&gt;
&lt;td&gt;Xεῖτε&lt;/td&gt;
&lt;td&gt;Xοῦτε&lt;/td&gt;
&lt;td&gt;Xᾶτε&lt;/td&gt;
&lt;td&gt;Xῆτε&lt;/td&gt;
&lt;td&gt;Xτε&lt;/td&gt;
&lt;td&gt;Xετε&lt;/td&gt;
&lt;td&gt;Xοτε&lt;/td&gt;
&lt;td&gt;Xατε&lt;/td&gt;
&lt;td&gt;ἐστέ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;Xουσι(ν)&lt;/td&gt;
&lt;td&gt;Xοῦσι(ν)&lt;/td&gt;
&lt;td&gt;Xοῦσι(ν)&lt;/td&gt;
&lt;td&gt;Xῶσι(ν)&lt;/td&gt;
&lt;td&gt;Xῶσι(ν)&lt;/td&gt;
&lt;td&gt;Xασι(ν)&lt;/td&gt;
&lt;td&gt;Xέασι(ν)&lt;/td&gt;
&lt;td&gt;Xόασι(ν)&lt;/td&gt;
&lt;td&gt;Xᾶσι(ν)&lt;/td&gt;
&lt;td&gt;εἰσί(ν)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;As we&#39;ve already noted, some cells have identical distinguishers (for example, the ῶ of &lt;strong&gt;PA-2&lt;/strong&gt;, &lt;strong&gt;PA-3&lt;/strong&gt;, &lt;strong&gt;PA-4&lt;/strong&gt; and &lt;strong&gt;PA-5&lt;/strong&gt;). More on that shortly.&lt;/p&gt;
&lt;p&gt;But first note something about &lt;strong&gt;PA-6&lt;/strong&gt;: it subsumes the next three paradigms and, in fact, in the case of &lt;strong&gt;2SG&lt;/strong&gt; subsumes every paradigm except &lt;strong&gt;PA-10&lt;/strong&gt;. In other words, a word form from another paradigm technically matches &lt;strong&gt;PA-6&lt;/strong&gt; too. If you go back to &lt;a href=&#34;/2017/08/02/tour-greek-morphology-part-10/&#34;&gt;part 10&lt;/a&gt;, you&#39;ll see that our exemplar for &lt;strong&gt;PA-6&lt;/strong&gt; was δεικνύναι, δείκνυμι, and so on. The &lt;em&gt;only&lt;/em&gt; reason &lt;strong&gt;PA-6&lt;/strong&gt; doesn&#39;t have a vowel like &lt;strong&gt;PA-7&lt;/strong&gt;, &lt;strong&gt;PA-8&lt;/strong&gt;, &lt;strong&gt;PA-9&lt;/strong&gt; is that the vowel is always υ and hence it was dropped out of the distinguisher analysis. But we have no reason at this stage not to suppose the upsilon is an important part of the &lt;strong&gt;PA-6&lt;/strong&gt; paradigm (it just doesn&#39;t distinguish cells &lt;em&gt;within&lt;/em&gt; the paradigm). So I&#39;m going to tentatively put it back for the purposes of comparing &lt;em&gt;across&lt;/em&gt; paradigms. I&#39;ll call this modified distinguisher paradigm &lt;strong&gt;PA-6a&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Repeating the paradigm of paradigms with this small modification:&lt;/p&gt;
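&lt;p&gt;To see the subsumption concretely, treat a distinguisher such as Xναι as &#34;any stem followed by ναι&#34;. The Python below is a minimal sketch under that assumption. The helper name &lt;code&gt;matches&lt;/code&gt; is made up, and τιθέναι is used as an example of an infinitive fitting the Xέναι pattern.&lt;/p&gt;

```python
import unicodedata


def matches(form, distinguisher):
    """True if the form ends with the distinguisher once its leading X is dropped."""
    form = unicodedata.normalize("NFC", form)
    suffix = unicodedata.normalize("NFC", distinguisher)[1:]
    return form.endswith(suffix)


# The plain PA-6 infinitive pattern accepts both forms...
print(matches("δεικνύναι", "Xναι"), matches("τιθέναι", "Xναι"))
# ...while putting the upsilon back accepts only the first.
print(matches("δεικνύναι", "Xύναι"), matches("τιθέναι", "Xύναι"))
```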
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;PA-1&lt;/th&gt;
&lt;th&gt;PA-2&lt;/th&gt;
&lt;th&gt;PA-3&lt;/th&gt;
&lt;th&gt;PA-4&lt;/th&gt;
&lt;th&gt;PA-5&lt;/th&gt;
&lt;th&gt;PA-6a&lt;/th&gt;
&lt;th&gt;PA-7&lt;/th&gt;
&lt;th&gt;PA-8&lt;/th&gt;
&lt;th&gt;PA-9&lt;/th&gt;
&lt;th&gt;PA-10&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INF&lt;/td&gt;
&lt;td&gt;Xειν&lt;/td&gt;
&lt;td&gt;Xεῖν&lt;/td&gt;
&lt;td&gt;Xοῦν&lt;/td&gt;
&lt;td&gt;Xᾶν&lt;/td&gt;
&lt;td&gt;Xῆν&lt;/td&gt;
&lt;td&gt;Xύναι&lt;/td&gt;
&lt;td&gt;Xέναι&lt;/td&gt;
&lt;td&gt;Xόναι&lt;/td&gt;
&lt;td&gt;Xάναι&lt;/td&gt;
&lt;td&gt;εἶναι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;Xω&lt;/td&gt;
&lt;td&gt;Xῶ&lt;/td&gt;
&lt;td&gt;Xῶ&lt;/td&gt;
&lt;td&gt;Xῶ&lt;/td&gt;
&lt;td&gt;Xῶ&lt;/td&gt;
&lt;td&gt;Xυμι&lt;/td&gt;
&lt;td&gt;Xημι&lt;/td&gt;
&lt;td&gt;Xωμι&lt;/td&gt;
&lt;td&gt;Xημι&lt;/td&gt;
&lt;td&gt;εἰμί&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;Xεις&lt;/td&gt;
&lt;td&gt;Xεῖς&lt;/td&gt;
&lt;td&gt;Xοῖς&lt;/td&gt;
&lt;td&gt;Xᾷς&lt;/td&gt;
&lt;td&gt;Xῇς&lt;/td&gt;
&lt;td&gt;Xυς&lt;/td&gt;
&lt;td&gt;Xης&lt;/td&gt;
&lt;td&gt;Xως&lt;/td&gt;
&lt;td&gt;Xης&lt;/td&gt;
&lt;td&gt;εἶ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;Xει&lt;/td&gt;
&lt;td&gt;Xεῖ&lt;/td&gt;
&lt;td&gt;Xοῖ&lt;/td&gt;
&lt;td&gt;Xᾷ&lt;/td&gt;
&lt;td&gt;Xῇ&lt;/td&gt;
&lt;td&gt;Xυσι(ν)&lt;/td&gt;
&lt;td&gt;Xησι(ν)&lt;/td&gt;
&lt;td&gt;Xωσι(ν)&lt;/td&gt;
&lt;td&gt;Xησι(ν)&lt;/td&gt;
&lt;td&gt;ἐστί(ν)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;Xομεν&lt;/td&gt;
&lt;td&gt;Xοῦμεν&lt;/td&gt;
&lt;td&gt;Xοῦμεν&lt;/td&gt;
&lt;td&gt;Xῶμεν&lt;/td&gt;
&lt;td&gt;Xῶμεν&lt;/td&gt;
&lt;td&gt;Xυμεν&lt;/td&gt;
&lt;td&gt;Xεμεν&lt;/td&gt;
&lt;td&gt;Xομεν&lt;/td&gt;
&lt;td&gt;Xαμεν&lt;/td&gt;
&lt;td&gt;ἐσμέν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;Xετε&lt;/td&gt;
&lt;td&gt;Xεῖτε&lt;/td&gt;
&lt;td&gt;Xοῦτε&lt;/td&gt;
&lt;td&gt;Xᾶτε&lt;/td&gt;
&lt;td&gt;Xῆτε&lt;/td&gt;
&lt;td&gt;Xυτε&lt;/td&gt;
&lt;td&gt;Xετε&lt;/td&gt;
&lt;td&gt;Xοτε&lt;/td&gt;
&lt;td&gt;Xατε&lt;/td&gt;
&lt;td&gt;ἐστέ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;Xουσι(ν)&lt;/td&gt;
&lt;td&gt;Xοῦσι(ν)&lt;/td&gt;
&lt;td&gt;Xοῦσι(ν)&lt;/td&gt;
&lt;td&gt;Xῶσι(ν)&lt;/td&gt;
&lt;td&gt;Xῶσι(ν)&lt;/td&gt;
&lt;td&gt;Xύασι(ν)&lt;/td&gt;
&lt;td&gt;Xέασι(ν)&lt;/td&gt;
&lt;td&gt;Xόασι(ν)&lt;/td&gt;
&lt;td&gt;Xᾶσι(ν)&lt;/td&gt;
&lt;td&gt;εἰσί(ν)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Now let&#39;s capture the common elements in the rows:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;PA-1&lt;/th&gt;
&lt;th&gt;PA-2&lt;/th&gt;
&lt;th&gt;PA-3&lt;/th&gt;
&lt;th&gt;PA-4&lt;/th&gt;
&lt;th&gt;PA-5&lt;/th&gt;
&lt;th&gt;PA-6a&lt;/th&gt;
&lt;th&gt;PA-7&lt;/th&gt;
&lt;th&gt;PA-8&lt;/th&gt;
&lt;th&gt;PA-9&lt;/th&gt;
&lt;th&gt;PA-10&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INF&lt;/td&gt;
&lt;td&gt;-ν&lt;/td&gt;
&lt;td&gt;-ν&lt;/td&gt;
&lt;td&gt;-ν&lt;/td&gt;
&lt;td&gt;-ν&lt;/td&gt;
&lt;td&gt;-ν&lt;/td&gt;
&lt;td&gt;-ναι&lt;/td&gt;
&lt;td&gt;-ναι&lt;/td&gt;
&lt;td&gt;-ναι&lt;/td&gt;
&lt;td&gt;-ναι&lt;/td&gt;
&lt;td&gt;-ναι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;-ω&lt;/td&gt;
&lt;td&gt;-ῶ&lt;/td&gt;
&lt;td&gt;-ῶ&lt;/td&gt;
&lt;td&gt;-ῶ&lt;/td&gt;
&lt;td&gt;-ῶ&lt;/td&gt;
&lt;td&gt;-μι&lt;/td&gt;
&lt;td&gt;-μι&lt;/td&gt;
&lt;td&gt;-μι&lt;/td&gt;
&lt;td&gt;-μι&lt;/td&gt;
&lt;td&gt;-μί&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;-{ι}ς&lt;/td&gt;
&lt;td&gt;-{ι}ς&lt;/td&gt;
&lt;td&gt;-{ι}ς&lt;/td&gt;
&lt;td&gt;-{ι}ς&lt;/td&gt;
&lt;td&gt;-{ι}ς&lt;/td&gt;
&lt;td&gt;-ς&lt;/td&gt;
&lt;td&gt;-ς&lt;/td&gt;
&lt;td&gt;-ς&lt;/td&gt;
&lt;td&gt;-ς&lt;/td&gt;
&lt;td&gt;εἶ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;-{ι}&lt;/td&gt;
&lt;td&gt;-{ι}&lt;/td&gt;
&lt;td&gt;-{ι}&lt;/td&gt;
&lt;td&gt;-{ι}&lt;/td&gt;
&lt;td&gt;-{ι}&lt;/td&gt;
&lt;td&gt;-σι(ν)&lt;/td&gt;
&lt;td&gt;-σι(ν)&lt;/td&gt;
&lt;td&gt;-σι(ν)&lt;/td&gt;
&lt;td&gt;-σι(ν)&lt;/td&gt;
&lt;td&gt;ἐστί(ν)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;-μεν&lt;/td&gt;
&lt;td&gt;-μεν&lt;/td&gt;
&lt;td&gt;-μεν&lt;/td&gt;
&lt;td&gt;-μεν&lt;/td&gt;
&lt;td&gt;-μεν&lt;/td&gt;
&lt;td&gt;-μεν&lt;/td&gt;
&lt;td&gt;-μεν&lt;/td&gt;
&lt;td&gt;-μεν&lt;/td&gt;
&lt;td&gt;-μεν&lt;/td&gt;
&lt;td&gt;-μέν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;-τε&lt;/td&gt;
&lt;td&gt;-τε&lt;/td&gt;
&lt;td&gt;-τε&lt;/td&gt;
&lt;td&gt;-τε&lt;/td&gt;
&lt;td&gt;-τε&lt;/td&gt;
&lt;td&gt;-τε&lt;/td&gt;
&lt;td&gt;-τε&lt;/td&gt;
&lt;td&gt;-τε&lt;/td&gt;
&lt;td&gt;-τε&lt;/td&gt;
&lt;td&gt;-τέ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;-σι(ν)&lt;/td&gt;
&lt;td&gt;-σι(ν)&lt;/td&gt;
&lt;td&gt;-σι(ν)&lt;/td&gt;
&lt;td&gt;-σι(ν)&lt;/td&gt;
&lt;td&gt;-σι(ν)&lt;/td&gt;
&lt;td&gt;-ασι(ν)&lt;/td&gt;
&lt;td&gt;-ασι(ν)&lt;/td&gt;
&lt;td&gt;-ασι(ν)&lt;/td&gt;
&lt;td&gt;-ᾶσι(ν)&lt;/td&gt;
&lt;td&gt;-σί(ν)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The &lt;strong&gt;INF&lt;/strong&gt;, although coming in two variants, has the property that it gives us enough information to know &lt;strong&gt;every form of the word in the present indicative active&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;No other slots in our paradigms do that.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;1SG&lt;/strong&gt; can&#39;t distinguish within the set {&lt;strong&gt;PA-2&lt;/strong&gt;, &lt;strong&gt;PA-3&lt;/strong&gt;, &lt;strong&gt;PA-4&lt;/strong&gt;, &lt;strong&gt;PA-5&lt;/strong&gt;} or within the set {&lt;strong&gt;PA-7&lt;/strong&gt;, &lt;strong&gt;PA-9&lt;/strong&gt;}&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;2SG&lt;/strong&gt; can&#39;t distinguish within the set {&lt;strong&gt;PA-7&lt;/strong&gt;, &lt;strong&gt;PA-9&lt;/strong&gt;}&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;3SG&lt;/strong&gt; can&#39;t distinguish within the set {&lt;strong&gt;PA-7&lt;/strong&gt;, &lt;strong&gt;PA-9&lt;/strong&gt;}&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;1PL&lt;/strong&gt; can&#39;t distinguish within the set {&lt;strong&gt;PA-2&lt;/strong&gt;, &lt;strong&gt;PA-3&lt;/strong&gt;}, the set {&lt;strong&gt;PA-1&lt;/strong&gt;, &lt;strong&gt;PA-8&lt;/strong&gt;},  or the set {&lt;strong&gt;PA-4&lt;/strong&gt;, &lt;strong&gt;PA-5&lt;/strong&gt;}&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;2PL&lt;/strong&gt; can&#39;t distinguish within the set {&lt;strong&gt;PA-1&lt;/strong&gt;, &lt;strong&gt;PA-7&lt;/strong&gt;}&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;3PL&lt;/strong&gt; can&#39;t distinguish within the set {&lt;strong&gt;PA-2&lt;/strong&gt;, &lt;strong&gt;PA-3&lt;/strong&gt;} or within the set {&lt;strong&gt;PA-4&lt;/strong&gt;, &lt;strong&gt;PA-5&lt;/strong&gt;}&lt;/li&gt;
&lt;/ul&gt;
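&lt;p&gt;The list above can be checked mechanically. What follows is only a minimal Python sketch: the distinguishers are transcribed from the table above, the &lt;strong&gt;INF&lt;/strong&gt; row is filled in from the standard infinitive of each class and should be checked against that table, and the names &lt;code&gt;CLASSES&lt;/code&gt;, &lt;code&gt;PARADIGMS&lt;/code&gt; and &lt;code&gt;ambiguous_sets&lt;/code&gt; are made up for illustration rather than taken from any library.&lt;/p&gt;

```python
import unicodedata
from collections import defaultdict

# Paradigm class labels, in the column order of the table above.
CLASSES = ["PA-1", "PA-2", "PA-3", "PA-4", "PA-5",
           "PA-6a", "PA-7", "PA-8", "PA-9", "PA-10"]

# Distinguisher for each slot, one entry per class.
# ASSUMPTION: the INF row is the standard infinitive of each class.
PARADIGMS = {
    "INF": ["Xειν", "Xεῖν", "Xοῦν", "Xᾶν", "Xῆν",
            "Xύναι", "Xέναι", "Xόναι", "Xάναι", "εἶναι"],
    "1SG": ["Xω", "Xῶ", "Xῶ", "Xῶ", "Xῶ",
            "Xυμι", "Xημι", "Xωμι", "Xημι", "εἰμί"],
    "2SG": ["Xεις", "Xεῖς", "Xοῖς", "Xᾷς", "Xῇς",
            "Xυς", "Xης", "Xως", "Xης", "εἶ"],
    "3SG": ["Xει", "Xεῖ", "Xοῖ", "Xᾷ", "Xῇ",
            "Xυσι(ν)", "Xησι(ν)", "Xωσι(ν)", "Xησι(ν)", "ἐστί(ν)"],
    "1PL": ["Xομεν", "Xοῦμεν", "Xοῦμεν", "Xῶμεν", "Xῶμεν",
            "Xυμεν", "Xεμεν", "Xομεν", "Xαμεν", "ἐσμέν"],
    "2PL": ["Xετε", "Xεῖτε", "Xοῦτε", "Xᾶτε", "Xῆτε",
            "Xυτε", "Xετε", "Xοτε", "Xατε", "ἐστέ"],
    "3PL": ["Xουσι(ν)", "Xοῦσι(ν)", "Xοῦσι(ν)", "Xῶσι(ν)", "Xῶσι(ν)",
            "Xύασι(ν)", "Xέασι(ν)", "Xόασι(ν)", "Xᾶσι(ν)", "εἰσί(ν)"],
}


def ambiguous_sets(slot):
    """Return the groups of classes that share a distinguisher in this slot."""
    groups = defaultdict(list)
    for cls, form in zip(CLASSES, PARADIGMS[slot]):
        # Normalize so that visually identical accented forms compare equal.
        groups[unicodedata.normalize("NFC", form)].append(cls)
    # Keep only the distinguishers shared by more than one class.
    return [members for members in groups.values() if len(members) != 1]


for slot in PARADIGMS:
    print(slot, ambiguous_sets(slot))
```

&lt;p&gt;With the data as transcribed here, only the &lt;strong&gt;INF&lt;/strong&gt; comes back with no ambiguous sets; every other slot reports the groups listed above.&lt;/p&gt;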
&lt;p&gt;Among other things, this is why the &lt;strong&gt;1SG&lt;/strong&gt; isn&#39;t a great choice of &lt;strong&gt;lemma&lt;/strong&gt; (or headword, or citation form) for a lexeme. It&#39;s the reason so many dictionaries and lemmatizations show the circumflex verbs uncontracted (e.g. ποιέω for ποιῶ) even though in many dialects, including the Koine, that&#39;s a nonsense word. Even then, most dictionaries don&#39;t distinguish &lt;strong&gt;PA-7&lt;/strong&gt; from &lt;strong&gt;PA-9&lt;/strong&gt; (τίθημι vs ἵστημι) although admittedly that&#39;s not as important as they aren&#39;t productive.&lt;/p&gt;
&lt;p&gt;In almost all respects, &lt;strong&gt;the present active infinitive is the perfect lemma for the present active forms of a verb&lt;/strong&gt;. Some have argued against the infinitive as lemma as it doesn&#39;t form a clause by itself (although neither do verbs with obligatory complements). A close candidate is the &lt;strong&gt;3SG&lt;/strong&gt;, the benefit of which is how common it is. The main downside is just that it doesn&#39;t distinguish &lt;strong&gt;PA-7&lt;/strong&gt; and &lt;strong&gt;PA-9&lt;/strong&gt;. But one could hardly go wrong focusing on the &lt;strong&gt;INF&lt;/strong&gt; and &lt;strong&gt;3SG&lt;/strong&gt; as the forms to most associate with each present active verb.&lt;/p&gt;
&lt;p&gt;It should be noted that even though the &lt;strong&gt;1SG&lt;/strong&gt; is the worst &lt;em&gt;predictively&lt;/em&gt;, it&#39;s completely &lt;em&gt;predictable&lt;/em&gt; from any other form. Also, despite some ambiguity in the &lt;strong&gt;1PL&lt;/strong&gt; and &lt;strong&gt;3PL&lt;/strong&gt;, they can be predicted from one another. Similarly, all the singulars in the &lt;strong&gt;PA-7&lt;/strong&gt; and &lt;strong&gt;PA-9&lt;/strong&gt; words predict each other.&lt;/p&gt;
&lt;p&gt;Another way of thinking about this is to group our paradigm classes by their shared properties:&lt;/p&gt;
&lt;table&gt;
  &lt;tr&gt;
    &lt;td colspan=&#34;3&#34;&gt;&lt;b&gt;PA-&lt;/b&gt;{&lt;b&gt;1&lt;/b&gt;, &lt;b&gt;2&lt;/b&gt;, &lt;b&gt;3&lt;/b&gt;, &lt;b&gt;4&lt;/b&gt;, &lt;b&gt;5&lt;/b&gt;}&lt;/td&gt;
    &lt;td colspan=&#34;3&#34;&gt;&lt;b&gt;INF&lt;/b&gt; ends in -ν, &lt;b&gt;1SG&lt;/b&gt; in -ω/-ῶ&lt;/td&gt;
    &lt;td&gt;thematic or omega verbs&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td colspan=&#34;2&#34; nowrap&gt;&lt;b&gt;PA-&lt;/b&gt;{&lt;b&gt;2&lt;/b&gt;, &lt;b&gt;3&lt;/b&gt;, &lt;b&gt;4&lt;/b&gt;, &lt;b&gt;5&lt;/b&gt;}&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td colspan=&#34;2&#34;&gt;circumflex throughout endings&lt;/td&gt;
    &lt;td&gt;circumflex or contract verbs&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;&lt;b&gt;PA-&lt;/b&gt;{&lt;b&gt;2&lt;/b&gt;, &lt;b&gt;3&lt;/b&gt;}&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;οῦ in &lt;b&gt;1PL&lt;/b&gt; and &lt;b&gt;3PL&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;&lt;b&gt;PA-&lt;/b&gt;{&lt;b&gt;4&lt;/b&gt;, &lt;b&gt;5&lt;/b&gt;}&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;ῶ in &lt;b&gt;1PL&lt;/b&gt; and &lt;b&gt;3PL&lt;/b&gt;&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td colspan=&#34;3&#34; nowrap&gt;&lt;b&gt;PA-&lt;/b&gt;{&lt;b&gt;6a&lt;/b&gt;, &lt;b&gt;7&lt;/b&gt;, &lt;b&gt;8&lt;/b&gt;, &lt;b&gt;9&lt;/b&gt;, &lt;b&gt;10&lt;/b&gt;}&lt;/td&gt;
    &lt;td colspan=&#34;3&#34; nowrap&gt;&lt;b&gt;INF&lt;/b&gt; ends in -ναι, &lt;b&gt;1SG&lt;/b&gt; in -μι&lt;/td&gt;
    &lt;td&gt;athematic or μι verbs&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td colspan=&#34;2&#34; nowrap&gt;&lt;b&gt;PA-&lt;/b&gt;{&lt;b&gt;6a&lt;/b&gt;, &lt;b&gt;7&lt;/b&gt;, &lt;b&gt;8&lt;/b&gt;, &lt;b&gt;9&lt;/b&gt;}&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td colspan=&#34;2&#34; nowrap&gt;&lt;b&gt;3SG&lt;/b&gt; in -σι(ν), &lt;b&gt;3PL&lt;/b&gt; in -ασι(ν)&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;&lt;b&gt;PA-&lt;/b&gt;{&lt;b&gt;7&lt;/b&gt;, &lt;b&gt;9&lt;/b&gt;}&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;&amp;nbsp;&lt;/td&gt;
    &lt;td&gt;η in singulars&lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;There are also other cross-cutting categories:&lt;/p&gt;
&lt;table&gt;
  &lt;tr&gt;
    &lt;td&gt;&lt;b&gt;PA-&lt;/b&gt;{&lt;b&gt;1&lt;/b&gt;, &lt;b&gt;8&lt;/b&gt;}&lt;/td&gt;
    &lt;td&gt;&lt;b&gt;1PL&lt;/b&gt; ends with ομεν&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;&lt;b&gt;PA-&lt;/b&gt;{&lt;b&gt;1&lt;/b&gt;, &lt;b&gt;7&lt;/b&gt;}&lt;/td&gt;
    &lt;td&gt;&lt;b&gt;2PL&lt;/b&gt; ends with ετε&lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;If one ignores accentuation, one could conceivably also come up with cross-cutting categories such as &lt;b&gt;PA-&lt;/b&gt;{&lt;b&gt;1&lt;/b&gt;, &lt;b&gt;2&lt;/b&gt;} which share the ει in the &lt;strong&gt;INF&lt;/strong&gt;, &lt;strong&gt;2SG&lt;/strong&gt;, and &lt;strong&gt;3SG&lt;/strong&gt;. Or &lt;b&gt;PA-&lt;/b&gt;{&lt;b&gt;4&lt;/b&gt;, &lt;b&gt;9&lt;/b&gt;} which both have ατε in &lt;strong&gt;2PL&lt;/strong&gt;. Or &lt;b&gt;PA-&lt;/b&gt;{&lt;b&gt;1&lt;/b&gt;, &lt;b&gt;2&lt;/b&gt;, &lt;b&gt;3&lt;/b&gt;} which all have ουσι(ν) in &lt;strong&gt;3PL&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Next we&#39;ll look at the middles.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part thirteen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 12</title>
    <link href="https://jktauber.com/2017/08/16/tour-greek-morphology-part-12/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 12"/>
    <published>2017-08-16</published>
    <updated>2017-08-16</updated>
    <id>https://jktauber.com/2017/08/16/tour-greek-morphology-part-12</id>
    <content type="html" xml:base="https://jktauber.com/2017/08/16/tour-greek-morphology-part-12/">&lt;p&gt;Part twelve of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;There is one very important verb whose paradigm we haven&#39;t looked at yet: the copula.&lt;/p&gt;
&lt;p&gt;For comparison, we&#39;ll put the present infinitive and indicative forms alongside the common endings of the &lt;strong&gt;μι&lt;/strong&gt; verbs we saw in &lt;a href=&#34;/2017/08/02/tour-greek-morphology-part-10/&#34;&gt;part 10&lt;/a&gt;.&lt;/p&gt;
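&lt;table&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INF&lt;/td&gt;
&lt;td&gt;εἶναι&lt;/td&gt;
&lt;td&gt;-ναι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;εἰμί&lt;/td&gt;
&lt;td&gt;-μι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;εἶ&lt;/td&gt;
&lt;td&gt;-ς&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;ἐστί(ν)&lt;/td&gt;
&lt;td&gt;-σι(ν)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;ἐσμέν&lt;/td&gt;
&lt;td&gt;-μεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;ἐστέ&lt;/td&gt;
&lt;td&gt;-τε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;εἰσί(ν)&lt;/td&gt;
&lt;td&gt;-ασι(ν)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;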
&lt;p&gt;Notice:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;all but the &lt;strong&gt;INF&lt;/strong&gt; and &lt;strong&gt;2SG&lt;/strong&gt; are enclitic&lt;/li&gt;
&lt;li&gt;in the &lt;strong&gt;INF&lt;/strong&gt;, &lt;strong&gt;1SG&lt;/strong&gt;, &lt;strong&gt;1PL&lt;/strong&gt; and &lt;strong&gt;2PL&lt;/strong&gt; we find the expected ending&lt;/li&gt;
&lt;li&gt;the &lt;strong&gt;3SG&lt;/strong&gt; and &lt;strong&gt;3PL&lt;/strong&gt; are slightly different&lt;/li&gt;
&lt;li&gt;the &lt;strong&gt;2SG&lt;/strong&gt; is lacking the ending altogether&lt;/li&gt;
&lt;li&gt;with all the endings removed, we sometimes have ἐσ and sometimes εἰ&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Recall in &lt;a href=&#34;/2017/07/23/tour-greek-morphology-part-9/&#34;&gt;part 9&lt;/a&gt; we said that &#34;it was not uncommon for Attic-Ionic to have σι for τι in other dialects&#34; (a type of lenition). Perhaps the &lt;strong&gt;3SG&lt;/strong&gt; ending was originally τι(ν) and it just became σι(ν) in all the &lt;strong&gt;μι&lt;/strong&gt; verbs except the copula.&lt;/p&gt;
&lt;p&gt;And in &lt;a href=&#34;/2017/08/03/tour-greek-morphology-part-11/&#34;&gt;part 11&lt;/a&gt; we questioned &#34;why the active &lt;strong&gt;2SG&lt;/strong&gt; and &lt;strong&gt;3SG&lt;/strong&gt; forms don’t end in σι and τι to mirror σαι and ται.&#34; Well, what if they originally did and some change masked this?&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;3SG&lt;/strong&gt; τι(ν) would be explained as an original τι with the occasional movable nu. The &lt;strong&gt;3SG&lt;/strong&gt; σι(ν) would just come from τι(ν) via the tendency for τι to become σι in Attic-Ionic.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;2SG&lt;/strong&gt; εἶ is perfectly explainable as coming from ἐσι with the intervocalic sigma dropping. In fact, we find ἐσσί in Homer, Pindar and other writings in older or more conservative dialects. If εἶ came from an older ἐσσί, that would not only suggest a -σι ending but a ἐσ stem. [&lt;strong&gt;EDIT&lt;/strong&gt;: it&#39;s also possible, or even likely given the evidence of other Indo-European languages, that the first sigma was dropped much earlier in Proto-Indo-European and the instances of ἐσσί are actually a reintroduction of a double sigma by analogy with the &lt;strong&gt;3SG&lt;/strong&gt;!]&lt;/p&gt;
&lt;p&gt;Is it plausible that εἶναι came from ἐσ+ναι and εἰμί from ἐσ+μι? Absolutely! A sigma dropping and the preceding vowel lengthening would explain those forms. But why would we still find ἐσμέν rather than, say, εἰμέν? Well, it turns out Homer and Herodotus &lt;em&gt;do&lt;/em&gt; have εἰμέν. There is clearly tension between keeping the ἐσ and going to εἰ, and different dialects went different ways even at the level of different cells in the paradigm.&lt;/p&gt;
&lt;p&gt;In the &lt;strong&gt;3PL&lt;/strong&gt;, we do find that Homer (as well as εἰσί) has ἔᾱσι, following the &lt;strong&gt;3PL&lt;/strong&gt; ending of the other &lt;strong&gt;μι&lt;/strong&gt; verbs, but much as the &lt;strong&gt;ω&lt;/strong&gt; verb ending -ουσι comes from -οντι, we can explain εἰσί from ἐσ+ντι.&lt;/p&gt;
&lt;p&gt;Further justification of earlier forms comes from comparison with other Indo-European languages but doing that would take us too far afield for this survey. For now, we&#39;ll just summarize what we have for this new paradigm.&lt;/p&gt;
&lt;p&gt;We&#39;ll call this &lt;strong&gt;PA-10&lt;/strong&gt; but because of the ἐσ/εἰ alternation, we can&#39;t really isolate distinguishers across the entire paradigm other than the full words themselves.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;PA-10&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INF&lt;/td&gt;
&lt;td&gt;εἶναι&lt;/td&gt;
&lt;td&gt;ἐσ+ναι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;εἰμί&lt;/td&gt;
&lt;td&gt;ἐσ+μι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;εἶ&lt;/td&gt;
&lt;td&gt;ἐσ+σι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;ἐστί(ν)&lt;/td&gt;
&lt;td&gt;ἐσ+τι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;ἐσμέν&lt;/td&gt;
&lt;td&gt;ἐσ+μεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;ἐστέ&lt;/td&gt;
&lt;td&gt;ἐσ+τε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;εἰσί(ν)&lt;/td&gt;
&lt;td&gt;ἐσ+ντι&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;As always, I stress this is a historical explanation, not an explanation of what was going on in the minds of native Greek speakers nor the best way to initially learn the forms of the copula.&lt;/p&gt;
&lt;p&gt;The μι/σι/τι/ντι pattern is fascinating, though, with its parallel to the middle μαι/σαι/ται/νται.&lt;/p&gt;
&lt;p&gt;There are still, of course, open questions, like the relationship between these endings and those of the &lt;strong&gt;ω&lt;/strong&gt; verbs that differ (not least of which is -μι vs -ω itself!), or the fact that our other &lt;strong&gt;μι&lt;/strong&gt; verbs seemed to use a different vowel in the singular than the plural and there&#39;s no sign of that in the copula. [&lt;strong&gt;EDIT&lt;/strong&gt;: also as noted, ἐσσι as the original form is problematic; it was likely ἐσι in Proto-Greek.]&lt;/p&gt;
&lt;p&gt;One earlier observation we can say a little bit more about now, though, is the alpha in the -ασι(ν) ending which previously seemed inexplicable. As we shall see later on, when a &lt;strong&gt;ν&lt;/strong&gt; can&#39;t be pronounced in a particular context, it often became an &lt;strong&gt;α&lt;/strong&gt; rather than just dropping out completely. Given we reconstruct a &lt;strong&gt;ν&lt;/strong&gt; in the &lt;strong&gt;3PL&lt;/strong&gt; ending, this &lt;strong&gt;ν&lt;/strong&gt; becoming an &lt;strong&gt;α&lt;/strong&gt; rather than dropping out entirely explains -ασι(ν) (with no compensatory lengthening). Because the &lt;strong&gt;μι&lt;/strong&gt; verbs (unlike the &lt;strong&gt;ω&lt;/strong&gt; verbs) have a &lt;strong&gt;3SG&lt;/strong&gt; ending in σι(ν), keeping the &lt;strong&gt;α&lt;/strong&gt; around was useful to discriminate between the singular and plural. In the case of the copula, though, the &lt;strong&gt;3SG&lt;/strong&gt; retained the &lt;strong&gt;τ&lt;/strong&gt; so there was less reason to keep the old &lt;strong&gt;ν&lt;/strong&gt; (pronounced as &lt;strong&gt;α&lt;/strong&gt;) around and it could just drop out entirely.&lt;/p&gt;
&lt;p&gt;We&#39;ve now covered the major present infinitive and indicative paradigms. In the next few posts in this series we&#39;re going to step back a little and talk about the relationship between paradigms, the notion of lemmas and citation forms, some more about cell filling and class inference, and some statistics about the frequency of these different paradigms we&#39;ve looked at. Then we&#39;ll move beyond the present and look at a whole new set of paradigms!&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part twelve of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">Speaking in Berlin</title>
    <link href="https://jktauber.com/2017/08/05/speaking-berlin/" rel="alternate" type="text/html" title="Speaking in Berlin"/>
    <published>2017-08-05</published>
    <updated>2017-08-05</updated>
    <id>https://jktauber.com/2017/08/05/speaking-berlin</id>
    <content type="html" xml:base="https://jktauber.com/2017/08/05/speaking-berlin/">&lt;p&gt;This afternoon I&#39;m heading off to Berlin for my first Society of Biblical Literature International Meeting, where I&#39;ll be speaking on adaptive reading environments for Biblical Greek.&lt;/p&gt;
&lt;p&gt;I&#39;ve attended a number of SBL Annual Meetings in the US and spoken at two but this will be my first International Meeting. At the invitation of Professor Nicolai Winther-Nielsen, I&#39;ll be giving an update on the talk I gave at last year&#39;s Annual Meeting.&lt;/p&gt;
&lt;p&gt;Here&#39;s my abstract:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The Route to Adaptive Learning of Greek&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;One of the promises of machine-actionable linguistic data linked to biblical texts is the enablement of new types of language learning tools. At their simplest, such tools might involve adding the necessary scaffolding to enable students to read more text than they otherwise might by providing glosses for rarer words or help on idioms, irregular morphology, and unusual syntactic constructions. Such tools, however, are hardly novel and have long been manually produced in printed form. Equivalent electronic versions don&#39;t really take advantage of what&#39;s possible. In this paper I discuss an online reading environment for Ancient Greek, and the Greek New Testament in particular, that takes advantage of the availability of open, machine-actionable resources such as treebanks and morphological analyses for more automated and consistent generation of scaffolding but which goes a step further by being adaptive to an individual student&#39;s knowledge at a given point. Such knowledge need not be explicitly provided (although it can be: to align with a particular textbook, for example). It can also be built up implicitly from what the reader is requesting more information or help on: What words are they having trouble remembering the meaning of? What forms are they having trouble parsing? The model of student knowledge is then integrated with learning tools such as spaced-repetition flash cards and parsing drills with the results of these tools then feeding back into better adapting scaffolding for reading. The online reading environment will be open source and potentially applicable to a wide range of other language and texts provided the necessary linguistic data is available.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Thank you to Professor Winther-Nielsen for inviting me.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">This afternoon I&#39;m heading off to Berlin for my first Society of Biblical Literature International Meeting, where I&#39;ll be speaking on adaptive reading environments for Biblical Greek.</summary>
  </entry><entry>
    <title type="html">First Week of New Vocab Site</title>
    <link href="https://jktauber.com/2017/08/05/first-week-new-vocab-site/" rel="alternate" type="text/html" title="First Week of New Vocab Site"/>
    <published>2017-08-05</published>
    <updated>2017-08-05</updated>
    <id>https://jktauber.com/2017/08/05/first-week-new-vocab-site</id>
    <content type="html" xml:base="https://jktauber.com/2017/08/05/first-week-new-vocab-site/">&lt;p&gt;Last week I launched a site for Greek vocabulary. Here&#39;s how the first week has gone.&lt;/p&gt;
&lt;p&gt;Over time &lt;a href=&#34;http://vocab.oxlos.org/&#34;&gt;http://vocab.oxlos.org/&lt;/a&gt; will contain a variety of tools for learning and assessing Greek vocabulary. As mentioned in &lt;a href=&#34;/2017/07/29/new-site-vocabulary-experiments/&#34;&gt;my blog post&lt;/a&gt; a week ago, I&#39;m starting with some experiments based on the work of Paul Nation.&lt;/p&gt;
&lt;p&gt;I&#39;m delighted with the response so far and am very thankful to everyone who has participated. In the first week 58 people signed up, 37 people completed at least one full activity with 19 completing more than one and six people completing at least four activities.&lt;/p&gt;
&lt;p&gt;Thanks to &lt;a href=&#34;https://thepatrologist.com&#34;&gt;Seumas Macdonald&lt;/a&gt;, I expanded the initial New Testament vocabulary testing a couple of days ago to some Patristic vocabulary. I&#39;ll also be adding some classical Greek vocabulary soon.&lt;/p&gt;
&lt;p&gt;As my previous post says, some of my initial research questions are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;how reliable is a test like Nation&#39;s vocabulary level test at estimating one’s NT Greek vocabulary size?&lt;/li&gt;
&lt;li&gt;how much is frequency a factor in how likely a student is to know a word?&lt;/li&gt;
&lt;li&gt;what other factors contribute to likelihood a student knows a word?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I do need to continue to gather data but so far the Nation-style test seems to be working well and individual frequency bands actually do seem very good indicators of overall vocabulary size. I&#39;ll publish results with analysis over time. I&#39;ll also continue to release new activities.&lt;/p&gt;
&lt;p&gt;As well as expanding the vocabulary to broader corpora and other parts of speech besides nouns, I also want to explore the impact of English cognates and relatedness between lexemes due to derivation. I&#39;ll also be adding some additional activity types based on the work of other vocabulary acquisition researchers such as Schmitt and Meara.&lt;/p&gt;
&lt;p&gt;Thanks again to everyone who has participated so far and please continue to do so (and share a link to the site with Greek students, particularly those at a less-advanced level).&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Last week I launched a site for Greek vocabulary. Here&#39;s how the first week has gone.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 11</title>
    <link href="https://jktauber.com/2017/08/03/tour-greek-morphology-part-11/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 11"/>
    <published>2017-08-03</published>
    <updated>2017-08-03</updated>
    <id>https://jktauber.com/2017/08/03/tour-greek-morphology-part-11</id>
    <content type="html" xml:base="https://jktauber.com/2017/08/03/tour-greek-morphology-part-11/">&lt;p&gt;Part eleven of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;In &lt;a href=&#34;/2017/08/02/tour-greek-morphology-part-10/&#34;&gt;part 10&lt;/a&gt;, we looked at some new active forms. Now it&#39;s time to look at the corresponding middle forms.&lt;/p&gt;
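&lt;table&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INF&lt;/td&gt;
&lt;td&gt;δείκνυσθαι&lt;/td&gt;
&lt;td&gt;τίθεσθαι&lt;/td&gt;
&lt;td&gt;δίδοσθαι&lt;/td&gt;
&lt;td&gt;ἵστασθαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;δείκνυμαι&lt;/td&gt;
&lt;td&gt;τίθεμαι&lt;/td&gt;
&lt;td&gt;δίδομαι&lt;/td&gt;
&lt;td&gt;ἵσταμαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;δείκνυσαι&lt;/td&gt;
&lt;td&gt;τίθεσαι&lt;/td&gt;
&lt;td&gt;δίδοσαι&lt;/td&gt;
&lt;td&gt;ἵστασαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;δείκνυται&lt;/td&gt;
&lt;td&gt;τίθεται&lt;/td&gt;
&lt;td&gt;δίδοται&lt;/td&gt;
&lt;td&gt;ἵσταται&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;δεικνύμεθα&lt;/td&gt;
&lt;td&gt;τιθέμεθα&lt;/td&gt;
&lt;td&gt;διδόμεθα&lt;/td&gt;
&lt;td&gt;ἱστάμεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;δείκνυσθε&lt;/td&gt;
&lt;td&gt;τίθεσθε&lt;/td&gt;
&lt;td&gt;δίδοσθε&lt;/td&gt;
&lt;td&gt;ἵστασθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;δείκνυνται&lt;/td&gt;
&lt;td&gt;τίθενται&lt;/td&gt;
&lt;td&gt;δίδονται&lt;/td&gt;
&lt;td&gt;ἵστανται&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;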
&lt;p&gt;In the middle forms, there is no change in the vowel and so it doesn&#39;t need to be included in the distinguisher. In this sense, we really only have one distinguisher paradigm for all these forms in the middle.&lt;/p&gt;
&lt;p&gt;However, if we were contrasting against the active forms as well, we could identify a &lt;strong&gt;PM-6&lt;/strong&gt;, &lt;strong&gt;PM-7&lt;/strong&gt;, &lt;strong&gt;PM-8&lt;/strong&gt;, and &lt;strong&gt;PM-9&lt;/strong&gt; paired up with &lt;strong&gt;PA-6&lt;/strong&gt;, &lt;strong&gt;PA-7&lt;/strong&gt;, &lt;strong&gt;PA-8&lt;/strong&gt;, &lt;strong&gt;PA-9&lt;/strong&gt;:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;PM-6&lt;/th&gt;
&lt;th&gt;PM-7&lt;/th&gt;
&lt;th&gt;PM-8&lt;/th&gt;
&lt;th&gt;PM-9&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INF&lt;/td&gt;
&lt;td&gt;Xσθαι&lt;/td&gt;
&lt;td&gt;Xεσθαι&lt;/td&gt;
&lt;td&gt;Xοσθαι&lt;/td&gt;
&lt;td&gt;Xασθαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;Xμαι&lt;/td&gt;
&lt;td&gt;Xεμαι&lt;/td&gt;
&lt;td&gt;Xομαι&lt;/td&gt;
&lt;td&gt;Xαμαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;Xσαι&lt;/td&gt;
&lt;td&gt;Xεσαι&lt;/td&gt;
&lt;td&gt;Xοσαι&lt;/td&gt;
&lt;td&gt;Xασαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;Xται&lt;/td&gt;
&lt;td&gt;Xεται&lt;/td&gt;
&lt;td&gt;Xοται&lt;/td&gt;
&lt;td&gt;Xαται&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;Xμεθα&lt;/td&gt;
&lt;td&gt;Xέμεθα&lt;/td&gt;
&lt;td&gt;Xόμεθα&lt;/td&gt;
&lt;td&gt;Xάμεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;Xσθε&lt;/td&gt;
&lt;td&gt;Xεσθε&lt;/td&gt;
&lt;td&gt;Xοσθε&lt;/td&gt;
&lt;td&gt;Xασθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;Xνται&lt;/td&gt;
&lt;td&gt;Xενται&lt;/td&gt;
&lt;td&gt;Xονται&lt;/td&gt;
&lt;td&gt;Xανται&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;But the common endings for the &lt;strong&gt;μι&lt;/strong&gt; verbs are very clear. Here they are alongside the endings we reconstructed for the earlier middle paradigms:&lt;/p&gt;
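&lt;table&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INF&lt;/td&gt;
&lt;td&gt;-σθαι&lt;/td&gt;
&lt;td&gt;ε σθαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;-μαι&lt;/td&gt;
&lt;td&gt;ο μαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;-σαι&lt;/td&gt;
&lt;td&gt;ε σαι &amp;gt; ῃ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;-ται&lt;/td&gt;
&lt;td&gt;ε ται&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;td&gt;ο μεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;td&gt;ε σθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;-νται&lt;/td&gt;
&lt;td&gt;ο νται&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;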
&lt;p&gt;This not only provides clear support for the ε+σαι reconstruction of the ῃ &lt;strong&gt;MID 2SG&lt;/strong&gt; form but also makes clear how the &lt;strong&gt;ω&lt;/strong&gt; verbs (both barytone and circumflex) use the same endings as the &lt;strong&gt;μι&lt;/strong&gt; verbs but with the ε/ο vowel (the so-called thematic vowel) attached to the stem before the ending. In the middle, this is the only difference (slightly obscured when ῃ is used in the &lt;strong&gt;2SG&lt;/strong&gt;).&lt;/p&gt;
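&lt;p&gt;To make the &#34;same endings plus a thematic vowel&#34; idea concrete, here is a small Python sketch. It only builds the endings listed above, ignores accentuation entirely, and treats the ε+σαι to ῃ step as a simple lookup rather than modelling the sound change. The names &lt;code&gt;SHARED_ENDINGS&lt;/code&gt;, &lt;code&gt;THEMATIC_VOWEL&lt;/code&gt;, &lt;code&gt;CONTRACTIONS&lt;/code&gt; and &lt;code&gt;thematic_ending&lt;/code&gt; are invented for this illustration.&lt;/p&gt;

```python
# Middle endings shared by the μι verbs and the ω verbs (accents ignored).
SHARED_ENDINGS = {
    "INF": "σθαι",
    "1SG": "μαι",
    "2SG": "σαι",
    "3SG": "ται",
    "1PL": "μεθα",
    "2PL": "σθε",
    "3PL": "νται",
}

# The ε/ο (thematic) vowel the ω verbs put before each ending.
THEMATIC_VOWEL = {
    "INF": "ε",
    "1SG": "ο",
    "2SG": "ε",
    "3SG": "ε",
    "1PL": "ο",
    "2PL": "ε",
    "3PL": "ο",
}

# ASSUMPTION: the ε+σαι change is treated as a plain lookup for illustration.
CONTRACTIONS = {"εσαι": "ῃ"}


def thematic_ending(slot):
    """Thematic vowel plus shared ending, with the 2SG contraction applied."""
    raw = THEMATIC_VOWEL[slot] + SHARED_ENDINGS[slot]
    return CONTRACTIONS.get(raw, raw)


for slot in SHARED_ENDINGS:
    # Left: athematic (μι verb) ending. Right: thematic (ω verb) ending.
    print(slot, "-" + SHARED_ENDINGS[slot], "-" + thematic_ending(slot))
```

&lt;p&gt;The only cell where the two columns aren&#39;t related by simple concatenation is the &lt;strong&gt;2SG&lt;/strong&gt;, which is exactly the cell the ε+σαι reconstruction is meant to explain.&lt;/p&gt;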
&lt;p&gt;As mentioned in &lt;a href=&#34;/2017/07/23/tour-greek-morphology-part-9/&#34;&gt;part 9&lt;/a&gt;, there are some tantalising patterns here: the αι in 5 out of 7 cells; the μ/σ/τ in the 1st/2nd/3rd person.&lt;/p&gt;
&lt;p&gt;The appearance of μι in the &lt;strong&gt;ACT 1SG&lt;/strong&gt; is particularly interesting because we now have a μι/μαι contrast in the &lt;strong&gt;1SG&lt;/strong&gt; between active and middle which exactly mirrors the οντι/ονται contrast in the &lt;strong&gt;3PL&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;One might well question why the active &lt;strong&gt;2SG&lt;/strong&gt; and &lt;strong&gt;3SG&lt;/strong&gt; forms don&#39;t end in σι and τι to mirror σαι and ται. Or why the active infinitive isn&#39;t σθι. Or why the &lt;strong&gt;1PL&lt;/strong&gt; and &lt;strong&gt;2PL&lt;/strong&gt; have only a vague relationship between the active and middle. And we still have the question of where the alpha in the &lt;strong&gt;ACT 3PL&lt;/strong&gt; ασι(ν) ending comes from. We&#39;ll touch on some of these questions in the next post and we &lt;em&gt;will&lt;/em&gt; reveal some more historical and dialectal patterns.&lt;/p&gt;
&lt;p&gt;But it is again worth reiterating that &lt;strong&gt;the primary role of a distinguisher is not to be decomposable but merely to discriminate meaning&lt;/strong&gt;. That there are patterns between the distinguishers at all is not a fundamental requirement of the role they play in conveying information. There may be historical reasons for the patterns (as we&#39;ve already seen) and learnability pressures that favour them (or even conspire to introduce them over time) but we should not &lt;em&gt;expect&lt;/em&gt; them and therefore view their absence as any kind of defect or irregularity.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part eleven of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 10</title>
    <link href="https://jktauber.com/2017/08/02/tour-greek-morphology-part-10/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 10"/>
    <published>2017-08-02</published>
    <updated>2017-08-02</updated>
    <id>https://jktauber.com/2017/08/02/tour-greek-morphology-part-10</id>
    <content type="html" xml:base="https://jktauber.com/2017/08/02/tour-greek-morphology-part-10/">&lt;p&gt;Part ten of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;In previous posts we&#39;ve explored five distinct active and middle paradigms in the present indicative and infinitive.&lt;/p&gt;
&lt;p&gt;There are still a number of inflectional classes in the present we haven&#39;t covered yet and we&#39;ll introduce a few more active forms in this post.&lt;/p&gt;
&lt;table&gt;
    &lt;tr&gt;
        &lt;th&gt;INF&lt;/th&gt;
        &lt;td&gt;&lt;i&gt;δεικνύναι&lt;/i&gt; &amp;dagger;&lt;/td&gt;
        &lt;td&gt;τιθέναι&lt;/td&gt;
        &lt;td&gt;διδόναι&lt;/td&gt;
        &lt;td&gt;-ιστάναι &amp;dagger;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;th&gt;1SG&lt;/th&gt;
        &lt;td&gt;δείκνυμι&lt;/td&gt;
        &lt;td&gt;τίθημι&lt;/td&gt;
        &lt;td&gt;δίδωμι&lt;/td&gt;
        &lt;td&gt;-ίστημι&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;th&gt;2SG&lt;/th&gt;
        &lt;td&gt;&lt;i&gt;δείκνυς&lt;/i&gt; &amp;dagger;&lt;/td&gt;
        &lt;td&gt;&lt;i&gt;τίθης&lt;/i&gt;&lt;/td&gt;
        &lt;td&gt;-δίδως&lt;/td&gt;
        &lt;td&gt;&lt;i&gt;ἵστης&lt;/i&gt; &amp;dagger;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;th&gt;3SG&lt;/th&gt;
        &lt;td&gt;δείκνυσι(ν)&lt;/td&gt;
        &lt;td&gt;τίθησι(ν)&lt;/td&gt;
        &lt;td&gt;δίδωσι(ν)&lt;/td&gt;
        &lt;td&gt;-ίστησι(ν)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;th&gt;1PL&lt;/th&gt;
        &lt;td&gt;&lt;i&gt;δείκνυμεν&lt;/i&gt;&lt;/td&gt;
        &lt;td&gt;-τίθεμεν&lt;/td&gt;
        &lt;td&gt;&lt;i&gt;δίδομεν&lt;/i&gt;&lt;/td&gt;
        &lt;td&gt;&lt;i&gt;ἵσταμεν&lt;/i&gt; &amp;dagger;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;th&gt;2PL&lt;/th&gt;
        &lt;td&gt;&lt;i&gt;δείκνυτε&lt;/i&gt;&lt;/td&gt;
        &lt;td&gt;&lt;i&gt;τίθετε&lt;/i&gt;&lt;/td&gt;
        &lt;td&gt;&lt;i&gt;δίδοτε&lt;/i&gt;&lt;/td&gt;
        &lt;td&gt;&lt;i&gt;ἵστατε&lt;/i&gt; &amp;dagger;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;th&gt;3PL&lt;/th&gt;
        &lt;td&gt;&lt;i&gt;δεικνύασι(ν)&lt;/i&gt;&lt;/td&gt;
        &lt;td&gt;τιθέασι(ν)&lt;/td&gt;
        &lt;td&gt;διδόασι(ν)&lt;/td&gt;
        &lt;td&gt;&lt;i&gt;ἱστᾶσι(ν)&lt;/i&gt;&lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;In the above table, &lt;i&gt;italics&lt;/i&gt; indicates the form does not appear in the NT but the cell is filled from elsewhere; a preceding hyphen indicates the NT only contains the form with a preverb; and &amp;dagger; indicates the NT has another form from one of the inflectional classes we&#39;ve already seen (more on that later).&lt;/p&gt;
&lt;p&gt;It is worth noting that there are very few verbs that follow these paradigms but they are very common. In a future post, we&#39;ll look at the frequencies in more detail.&lt;/p&gt;
&lt;p&gt;Let&#39;s start with the distinguishers (removing the common elements in each column):&lt;/p&gt;
&lt;table&gt;
    &lt;tr&gt;
        &lt;th&gt;&amp;nbsp;&lt;/th&gt;
        &lt;th&gt;PA-6&lt;/th&gt;
        &lt;th&gt;PA-7&lt;/th&gt;
        &lt;th&gt;PA-8&lt;/th&gt;
        &lt;th&gt;PA-9&lt;/th&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;th&gt;INF&lt;/th&gt;
        &lt;td&gt;Xναι&lt;/td&gt;
        &lt;td&gt;Xέναι&lt;/td&gt;
        &lt;td&gt;Xόναι&lt;/td&gt;
        &lt;td&gt;Xάναι&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;th&gt;1SG&lt;/th&gt;
        &lt;td&gt;Xμι&lt;/td&gt;
        &lt;td&gt;Xημι&lt;/td&gt;
        &lt;td&gt;Xωμι&lt;/td&gt;
        &lt;td&gt;Xημι&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;th&gt;2SG&lt;/th&gt;
        &lt;td&gt;Xς&lt;/td&gt;
        &lt;td&gt;Xης&lt;/td&gt;
        &lt;td&gt;Xως&lt;/td&gt;
        &lt;td&gt;Xης&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;th&gt;3SG&lt;/th&gt;
        &lt;td&gt;Xσι(ν)&lt;/td&gt;
        &lt;td&gt;Xησι(ν)&lt;/td&gt;
        &lt;td&gt;Xωσι(ν)&lt;/td&gt;
        &lt;td&gt;Xησι(ν)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;th&gt;1PL&lt;/th&gt;
        &lt;td&gt;Xμεν&lt;/td&gt;
        &lt;td&gt;Xεμεν&lt;/td&gt;
        &lt;td&gt;Xομεν&lt;/td&gt;
        &lt;td&gt;Xαμεν&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;th&gt;2PL&lt;/th&gt;
        &lt;td&gt;Xτε&lt;/td&gt;
        &lt;td&gt;Xετε&lt;/td&gt;
        &lt;td&gt;Xοτε&lt;/td&gt;
        &lt;td&gt;Xατε&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;th&gt;3PL&lt;/th&gt;
        &lt;td&gt;Xασι(ν)&lt;/td&gt;
        &lt;td&gt;Xέασι(ν)&lt;/td&gt;
        &lt;td&gt;Xόασι(ν)&lt;/td&gt;
        &lt;td&gt;Xᾶσι(ν)&lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;
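&lt;p&gt;As a rough illustration of what removing the common elements in a column means, here is a minimal Python sketch for the τίθημι column. It assumes accents can be ignored when finding the shared part (which is why its output lacks the accents shown in the table) and it leaves off the movable (ν). The function names are hypothetical, not from any library.&lt;/p&gt;

```python
import os
import unicodedata


def strip_marks(form):
    """Remove accents and breathings so forms can be compared letter by letter."""
    decomposed = unicodedata.normalize("NFD", form)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def distinguishers(forms):
    """Strip the longest shared prefix from each (unaccented) form."""
    plain = [strip_marks(form) for form in forms]
    shared = os.path.commonprefix(plain)
    return shared, [form[len(shared):] for form in plain]


# Present active forms of τίθημι in the order INF, 1SG, 2SG, 3SG, 1PL, 2PL, 3PL.
tithemi = ["τιθέναι", "τίθημι", "τίθης", "τίθησι", "τίθεμεν", "τίθετε", "τιθέασι"]

shared, endings = distinguishers(tithemi)
print(shared)   # τιθ
print(endings)  # ['εναι', 'ημι', 'ης', 'ησι', 'εμεν', 'ετε', 'εασι']
```

&lt;p&gt;The shared part stops before the vowel because that vowel differs between the singular and the other forms, so the vowel ends up as part of each distinguisher.&lt;/p&gt;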

&lt;p&gt;At this point, the relationship between &lt;strong&gt;PA-6&lt;/strong&gt; and each of &lt;strong&gt;PA-7&lt;/strong&gt;, &lt;strong&gt;PA-8&lt;/strong&gt;, &lt;strong&gt;PA-9&lt;/strong&gt; seems to mirror that between &lt;strong&gt;PA-1&lt;/strong&gt; and each of &lt;strong&gt;PA-2&lt;/strong&gt;, &lt;strong&gt;PA-3&lt;/strong&gt;, &lt;strong&gt;PA-4&lt;/strong&gt; respectively. This is especially evident in the infinitive and plurals, where the vowels (-, ε, ο, α) line up with (&lt;strong&gt;PA-6&lt;/strong&gt;, &lt;strong&gt;PA-7&lt;/strong&gt;, &lt;strong&gt;PA-8&lt;/strong&gt;, &lt;strong&gt;PA-9&lt;/strong&gt;) in the same way that they line up with (&lt;strong&gt;PA-1&lt;/strong&gt;, &lt;strong&gt;PA-2&lt;/strong&gt;, &lt;strong&gt;PA-3&lt;/strong&gt;, &lt;strong&gt;PA-4&lt;/strong&gt;).&lt;/p&gt;
&lt;p&gt;If we isolate just the common endings (recurring horizontally) and place them alongside the endings we reconstructed in &lt;a href=&#34;/2017/07/23/tour-greek-morphology-part-9/&#34;&gt;part 9&lt;/a&gt;,  we get:&lt;/p&gt;
&lt;table&gt;
    &lt;tr&gt;
        &lt;th&gt;INF&lt;/th&gt;
        &lt;td&gt;-ναι&lt;/td&gt;
        &lt;td&gt;ε εν&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;th&gt;1SG&lt;/th&gt;
        &lt;td&gt;-μι&lt;/td&gt;
        &lt;td&gt;ω -&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;th&gt;2SG&lt;/th&gt;
        &lt;td&gt;-ς&lt;/td&gt;
        &lt;td&gt;ε ις&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;th&gt;3SG&lt;/th&gt;
        &lt;td&gt;-σι(ν)&lt;/td&gt;
        &lt;td&gt;ε ι&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;th&gt;1PL&lt;/th&gt;
        &lt;td&gt;-μεν&lt;/td&gt;
        &lt;td&gt;ο μεν&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;th&gt;2PL&lt;/th&gt;
        &lt;td&gt;-τε&lt;/td&gt;
        &lt;td&gt;ε τε&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;th&gt;3PL&lt;/th&gt;
        &lt;td&gt;-ασι(ν)&lt;/td&gt;
        &lt;td&gt;ο ντι &amp;gt; ουσι(ν)&lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;Notice that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;thematic vowels seem to be entirely missing&lt;/li&gt;
&lt;li&gt;the &lt;strong&gt;3PL&lt;/strong&gt; has an alpha, though&lt;/li&gt;
&lt;li&gt;some endings seem identical except for the lack of thematic vowel (&lt;strong&gt;1PL&lt;/strong&gt; and &lt;strong&gt;2PL&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;some are close (&lt;strong&gt;2SG&lt;/strong&gt; and &lt;strong&gt;3PL&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;some are not so close (&lt;strong&gt;INF&lt;/strong&gt; and &lt;strong&gt;3SG&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;but now the &lt;strong&gt;3SG&lt;/strong&gt; and &lt;strong&gt;3PL&lt;/strong&gt; are almost identical to &lt;em&gt;each other&lt;/em&gt; in these new paradigms&lt;/li&gt;
&lt;li&gt;the &lt;strong&gt;1SG&lt;/strong&gt; seems completely unrelated&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Because of the lack of thematic vowels (seen most strikingly in the &lt;strong&gt;1PL&lt;/strong&gt; and &lt;strong&gt;2PL&lt;/strong&gt; forms), these types of verbs are often called &lt;strong&gt;athematic&lt;/strong&gt; verbs. Because of the completely different ending μι in the &lt;strong&gt;1SG&lt;/strong&gt;, they are also often called &lt;strong&gt;μι&lt;/strong&gt; verbs. They &lt;em&gt;could&lt;/em&gt; be called &lt;strong&gt;ναι&lt;/strong&gt; verbs, but I&#39;m not aware of anyone who does that. Those three things are the most obvious contrasts, though.&lt;/p&gt;
&lt;p&gt;When we look back at the full forms, we also notice:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the vowel preceding the endings is different in the singular and the plural&lt;/li&gt;
&lt;li&gt;ἱστᾶσι(ν) is accented in a way that suggests a contraction, probably from αα which makes sense given the other plural forms.&lt;/li&gt;
&lt;li&gt;έα and όα haven&#39;t contracted in the &lt;strong&gt;3PL&lt;/strong&gt; (and note if they did, they would be identical to the &lt;strong&gt;3SG&lt;/strong&gt; in &lt;strong&gt;PA-7&lt;/strong&gt; and &lt;strong&gt;PA-8&lt;/strong&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It is as if the stems are τιθη, διδω, and ἱστη in the singular and τιθε, διδο, and ἱστα in the infinitive and plural. This is noteworthy for at least three reasons.&lt;/p&gt;
&lt;p&gt;Firstly, it&#39;s the first time we&#39;ve seen a contrast that only indicates number and not person.&lt;/p&gt;
&lt;p&gt;Secondly, it&#39;s not (just) a different ending indicating the number but a change in the vowel.&lt;/p&gt;
&lt;p&gt;And thirdly, it&#39;s redundant as the ending alone still conveys number.&lt;/p&gt;
&lt;p&gt;On the surface, it appears that δεικνυ keeps its vowel the same although length is not clear yet.&lt;/p&gt;
&lt;p&gt;It is important to note that, unlike the circumflex verbs &lt;strong&gt;PA-2&lt;/strong&gt; through &lt;strong&gt;PA-5&lt;/strong&gt; which, as we have shown, all have the same endings (as each other and as &lt;strong&gt;PA-1&lt;/strong&gt;), &lt;strong&gt;PA-6&lt;/strong&gt; through &lt;strong&gt;PA-9&lt;/strong&gt; have a new set of common endings distinct from those of &lt;strong&gt;PA-1&lt;/strong&gt; through &lt;strong&gt;PA-5&lt;/strong&gt; (with some overlap). The paradigms cannot be explained merely as stems interacting differently with the &lt;em&gt;same&lt;/em&gt; endings.&lt;/p&gt;
&lt;p&gt;We will pick up this point again soon, but first (in the next post), we&#39;ll look at the middle forms of our new verbs.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part ten of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">NT Book Similarity by Jaccard Distance of Lemma Sets</title>
    <link href="https://jktauber.com/2017/07/29/nt-book-similarity-jaccard-distance-lemma-sets/" rel="alternate" type="text/html" title="NT Book Similarity by Jaccard Distance of Lemma Sets"/>
    <published>2017-07-29</published>
    <updated>2017-07-29</updated>
    <id>https://jktauber.com/2017/07/29/nt-book-similarity-jaccard-distance-lemma-sets</id>
    <content type="html" xml:base="https://jktauber.com/2017/07/29/nt-book-similarity-jaccard-distance-lemma-sets/">&lt;p&gt;I was thinking about vocabulary differences between books of the New Testament and decided to see what happens when you do a hierarchical clustering analysis of NT books using the Jaccard distance of their lemma sets.&lt;/p&gt;
&lt;div style=&#34;color: #F00; background: #FDD; padding: 8px 16px; margin-bottom: 1em;&#34;&gt;
&lt;b&gt;UPDATE&lt;/b&gt;: I&#39;m now convinced much (although not all) of this is due to length effects. If you think about it, the Jaccard distance between a large set and a small set is going to be large just by virtue of the large set having more in it than the small set. This will naturally group the non-letters together, the short letters together, Romans and the Corinthian letters together and so on. So until I come up with a way to correct Jaccard distance for text length, I&#39;d take this post with a huge grain of salt.
&lt;/div&gt;

&lt;p&gt;This is some old-school stylometry but the results are still pretty interesting. For each book, I calculated the set of lemmas and then, for each pair of books, calculated the Jaccard coefficient (the size of the intersection of the two sets divided by the size of their union); the Jaccard distance is one minus that coefficient.&lt;/p&gt;
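&lt;p&gt;For concreteness, here is a minimal Python sketch of the Jaccard distance between two lemma sets, taking the distance to be one minus the coefficient. The function name and the toy lemma sets are made up for illustration; this is not the code used to produce the results in this post.&lt;/p&gt;

```python
def jaccard_distance(lemmas_a, lemmas_b):
    """Return 1 minus (size of intersection / size of union) for two lemma sets."""
    a, b = set(lemmas_a), set(lemmas_b)
    union = a.union(b)
    if not union:
        return 0.0  # convention chosen here: two empty sets count as identical
    return 1 - len(a.intersection(b)) / len(union)


# Made-up toy lemma sets, purely for illustration (not real book vocabularies).
book_x = {"λόγος", "θεός", "ἀγάπη", "φῶς"}
book_y = {"λόγος", "θεός", "νόμος", "πίστις"}

# Two shared lemmas out of six distinct lemmas gives a distance of 1 - 2/6.
print(jaccard_distance(book_x, book_y))
```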
&lt;p&gt;I then did a cluster analysis using Ward&#39;s criterion and rendered the results as a dendrogram:&lt;/p&gt;
&lt;div align=&#34;center&#34;&gt;
&lt;img src=&#34;/images/ward_jaccard_lemma.png&#34; width=&#34;100%&#34;&gt;
&lt;/div&gt;

&lt;p&gt;Notice that the first split is between the letters and non-letters.&lt;/p&gt;
&lt;p&gt;Within the non-letters, John&#39;s Gospel and Revelation cluster together as do Acts and the Synoptics. The Synoptics cluster with each other more than they do with Acts. Matthew and Mark cluster together more than they do with Luke.&lt;/p&gt;
&lt;p&gt;The highest division in the letters is between:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the non-pastoral Pauline epistles plus Hebrews, James and 1 Peter&lt;/li&gt;
&lt;li&gt;the pastorals plus the rest of the general epistles (2 Peter, the Johannine epistles and Jude)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That first division of letters further clusters into:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Galatians, Ephesians, Philippians, Colossians, 1 Thessalonians, 2 Thessalonians&lt;/li&gt;
&lt;li&gt;Romans, 1 Corinthians, 2 Corinthians, Hebrews, James and 1 Peter&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Ephesians and Colossians cluster together, the two epistles to the Thessalonians cluster together, and Galatians and Philippians cluster together.&lt;/p&gt;
&lt;p&gt;Romans, 1 Corinthians, and 2 Corinthians cluster (although 1 Corinthians clusters closer to Romans than to 2 Corinthians). James and 1 Peter cluster. Hebrews is in the same overall group but clusters closer to the Romans/Corinthian subgroup.&lt;/p&gt;
&lt;p&gt;The second division of letters clusters into:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Philemon, 2 John, 3 John&lt;/li&gt;
&lt;li&gt;Titus, 1 Timothy, 2 Timothy&lt;/li&gt;
&lt;li&gt;Jude, 1 John, 2 Peter&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;with the second and third clustering slightly closer than the first.&lt;/p&gt;
&lt;p&gt;2 John and 3 John cluster much closer to each other than to Philemon. The epistles to Timothy cluster slightly closer together than they do to Titus. 1 John and 2 Peter cluster slightly closer together than they do with Jude.&lt;/p&gt;
&lt;p&gt;I haven&#39;t thought about length effects here but they may influence the clustering of very short books together (and possibly very long books). A lot of the clustering does follow similar lengths so it&#39;s definitely worth thinking more about.&lt;/p&gt;
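&lt;p&gt;One way to see how strong a length effect could be: the intersection of two sets can never be larger than the smaller set, and the union can never be smaller than the larger set, so the Jaccard distance is at least one minus the ratio of the smaller size to the larger size. The Python sketch below uses made-up sizes and a hypothetical function name.&lt;/p&gt;

```python
def min_jaccard_distance(size_a, size_b):
    """Lower bound on the Jaccard distance given only the two set sizes."""
    smaller, larger = sorted((size_a, size_b))
    if larger == 0:
        return 0.0
    return 1 - smaller / larger


# Hypothetical sizes: a short letter with 100 lemmas and a long book with 1000.
bound = min_jaccard_distance(100, 1000)

# Best case for similarity: every lemma of the short text also occurs in the long one.
short_text = set(range(100))
long_text = set(range(1000))
actual = 1 - len(short_text.intersection(long_text)) / len(short_text.union(long_text))

print(bound, actual)  # the two values agree: the bound is reached in this best case
```

&lt;p&gt;So even if a short text used no lemma that a ten-times-larger text lacked, the two would still come out as very distant.&lt;/p&gt;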
&lt;p&gt;Of course, there&#39;s nothing new about this kind of analysis. As I said at the start, it&#39;s old school—the sort of thing I can imagine being published in a &#34;humanities computing&#34; journal in the 80s. But it&#39;s still interesting. And it might be even more interesting to apply to finer-grained text divisions and/or with properties other than lemmas.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I was thinking about vocabulary differences between books of the New Testament and decided to see what happens when you do a hierarchical clustering analysis of NT books using the Jaccard distance of their lemma sets.</summary>
  </entry><entry>
    <title type="html">New Site for Vocabulary Experiments</title>
    <link href="https://jktauber.com/2017/07/29/new-site-vocabulary-experiments/" rel="alternate" type="text/html" title="New Site for Vocabulary Experiments"/>
    <published>2017-07-29</published>
    <updated>2017-07-29</updated>
    <id>https://jktauber.com/2017/07/29/new-site-vocabulary-experiments</id>
    <content type="html" xml:base="https://jktauber.com/2017/07/29/new-site-vocabulary-experiments/">&lt;p&gt;I&#39;ve put together a new little site to host various activities to research vocabulary knowledge and acquisition in the context of Ancient and Biblical Greek.&lt;/p&gt;
&lt;p&gt;The new site is at:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;a href=&#34;http://vocab.oxlos.org/&#34;&gt;http://vocab.oxlos.org/&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;While eventually there will be a range of activity types and some spaced repetition practice, there is just a single activity type at the moment, based on work by vocabulary acquisition expert Paul Nation in the 1980s and 1990s.&lt;/p&gt;
&lt;p&gt;It is a &lt;strong&gt;receptive&lt;/strong&gt; vocabulary test, which means it focuses on whether you can understand a word when you come across it in text rather than whether you can produce the word in the right context. Each step of the activity asks you to select a word that best matches a given gloss, taken over a list of word-gloss pairs with a range of different frequencies.&lt;/p&gt;
&lt;p&gt;Nation&#39;s original tests (for English as a Foreign Language learners) used word lists split into frequency bands like the top 2000, top 3000, top 5000, and so on.&lt;/p&gt;
&lt;p&gt;I took the common nouns in the Greek New Testament and similarly broke them into frequency bands. Rather than have identically-sized buckets, I went by frequency cutoffs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;bucket 1 : 32 or more times&lt;/li&gt;
&lt;li&gt;bucket 2 : 16 to 31 times&lt;/li&gt;
&lt;li&gt;bucket 3 : 4 to 15 times&lt;/li&gt;
&lt;li&gt;bucket 4 : 2 or 3 times&lt;/li&gt;
&lt;li&gt;bucket 5 : 1 time&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;(Whether these are appropriate buckets will be assessed as part of this work.)&lt;/p&gt;
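&lt;p&gt;The cutoffs above can be written as a small Python function. This is only a sketch of the bucketing rule as listed; the function name is hypothetical and it is not the site&#39;s actual code.&lt;/p&gt;

```python
def frequency_bucket(count):
    """Map an NT occurrence count to one of the five buckets listed above."""
    if count >= 32:
        return 1
    if count >= 16:
        return 2
    if count >= 4:
        return 3
    if count >= 2:
        return 4
    if count == 1:
        return 5
    raise ValueError("count must be at least 1")


print([frequency_bucket(n) for n in (40, 20, 7, 3, 1)])  # [1, 2, 3, 4, 5]
```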
&lt;p&gt;From each bucket, 36 word-gloss pairs were randomly chosen (the glosses coming from Dodson&#39;s public domain glosses of NT lexemes). Of those 36, only 18 are tested, with the 18 untested words used as distractors. This follows Nation&#39;s approach.&lt;/p&gt;
&lt;p&gt;So each activity of this type involves 90 items. I&#39;ve so far generated two activities but it&#39;s easy for me to generate more over time. I&#39;ll also expand the items to other parts of speech and a larger Greek corpus (including Classical). As long as I have frequency information and glosses, I can easily generate activities.&lt;/p&gt;
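&lt;p&gt;The selection described above can be sketched in Python as follows. The data is placeholder data and the function name is hypothetical; how the distractors are shown alongside each tested item is not covered here.&lt;/p&gt;

```python
import random


def build_activity(buckets, per_bucket=36, tested=18, seed=None):
    """Pick word-gloss pairs per bucket and split them into tested items and distractors."""
    rng = random.Random(seed)
    activity = []
    for pairs in buckets:
        chosen = rng.sample(pairs, per_bucket)  # random choice without repeats
        activity.append({"tested": chosen[:tested], "distractors": chosen[tested:]})
    return activity


# Placeholder data: five buckets of 50 made-up (word, gloss) pairs each.
buckets = [[(f"word{b}-{i}", f"gloss{b}-{i}") for i in range(50)] for b in range(1, 6)]

activity = build_activity(buckets, seed=0)
print(sum(len(part["tested"]) for part in activity))  # 90
```

&lt;p&gt;Five buckets with 18 tested items each gives the 90 items per activity.&lt;/p&gt;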
&lt;p&gt;I also have some other types of activities I&#39;d like to implement, based on the research literature. I&#39;d like to roll out a new activity once every couple of weeks or so.&lt;/p&gt;
&lt;p&gt;There are some fairly basic, fundamental questions that I&#39;ll be able to start to answer once I get more people trying the initial activities:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;how reliable is a test like this at estimating one&#39;s NT Greek vocabulary size?&lt;/li&gt;
&lt;li&gt;how much is frequency a factor in how likely a student is to know a word?&lt;/li&gt;
&lt;li&gt;what other factors contribute to the likelihood that a student knows a word?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Future activities will be able to explore some of this in more detail such as the impact of English cognates or relatedness between lexemes due to derivation, etc.&lt;/p&gt;
&lt;p&gt;Ultimately this is all input into producing better learning tools. It will feed directly into the adaptive online reading environment I&#39;m currently working on.&lt;/p&gt;
&lt;p&gt;Thank you to everyone who has tried the activities so far and PLEASE continue to do more activities as I roll them out and help spread the word. The more people of varying ability I get doing these activities, the richer and more insightful the data will be.&lt;/p&gt;
&lt;p&gt;I&#39;ll share those insights on this blog as things progress.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I&#39;ve put together a new little site to host various activities to research vocabulary knowledge and acquisition in the context of Ancient and Biblical Greek.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 9</title>
    <link href="https://jktauber.com/2017/07/23/tour-greek-morphology-part-9/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 9"/>
    <published>2017-07-23</published>
    <updated>2017-07-23</updated>
    <id>https://jktauber.com/2017/07/23/tour-greek-morphology-part-9</id>
    <content type="html" xml:base="https://jktauber.com/2017/07/23/tour-greek-morphology-part-9/">&lt;p&gt;Part nine of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;In &lt;a href=&#34;/2017/07/17/tour-greek-morphology-part-8/&#34;&gt;part 8&lt;/a&gt; we saw, amongst other things, that the present active infinitive has a spurious diphthong ει from ε+ε whereas the present active second and third person singulars have a ει that is a true ε+ι diphthong.&lt;/p&gt;
&lt;p&gt;This somewhat justifies our observation of the ις and ι pattern in the second and third person singulars across all the present actives we&#39;ve seen so far.&lt;/p&gt;
&lt;p&gt;If we show the &#34;inert&#34; part of the endings separated from the vowel that interacts with a preceding stem vowel to form the circumflex verbs, we get something like this:&lt;/p&gt;
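&lt;table&gt;
&lt;tr&gt;&lt;th&gt;&amp;nbsp;&lt;/th&gt;&lt;th&gt;active&lt;/th&gt;&lt;th&gt;middle&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;INF&lt;/th&gt;&lt;td&gt;ε ε ν&lt;/td&gt;&lt;td&gt;ε σθαι&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;1SG&lt;/th&gt;&lt;td&gt;ω -&lt;/td&gt;&lt;td&gt;ο μαι&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;2SG&lt;/th&gt;&lt;td&gt;ε ις&lt;/td&gt;&lt;td&gt;η ι (sometimes ε ι)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;3SG&lt;/th&gt;&lt;td&gt;ε ι&lt;/td&gt;&lt;td&gt;ε ται&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;1PL&lt;/th&gt;&lt;td&gt;ο μεν&lt;/td&gt;&lt;td&gt;ο μεθα&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;2PL&lt;/th&gt;&lt;td&gt;ε τε&lt;/td&gt;&lt;td&gt;ε σθε&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;3PL&lt;/th&gt;&lt;td&gt;ου σι(ν)&lt;/td&gt;&lt;td&gt;ο νται&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;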
&lt;p&gt;You can see the predominance of initial ε and ο with three exceptions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the ω of the &lt;strong&gt;ACT 1SG&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;the ου of the &lt;strong&gt;ACT 3PL&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;the η of the &lt;strong&gt;MID 2SG&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We now know to ask the question: is ου in &lt;strong&gt;ACT 3PL&lt;/strong&gt; a spurious diphthong (from ο+ο) or a true diphthong (from ο+υ)? If υ works the same way as ι in our contraction rules, it must be a spurious diphthong.&lt;/p&gt;
&lt;p&gt;There&#39;s additional evidence for this:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;In the Western Greek dialects (like Doric) we find -οντι&lt;/li&gt;
&lt;li&gt;It was not uncommon for Attic-Ionic to have σι for τι in other dialects (we&#39;ll encounter more examples later)&lt;/li&gt;
&lt;li&gt;Dentals like ν drop out in Attic-Ionic when followed by σ and this generally causes the preceding vowel to lengthen (what is called &lt;strong&gt;compensatory lengthening&lt;/strong&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So it seems our ουσι(ν) was originally from the -οντι preserved in Doric.&lt;/p&gt;
&lt;p&gt;This introduces interesting parallels with the -ονται in the middle.&lt;/p&gt;
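&lt;p&gt;The derivation of ουσι(ν) from -οντι can be sketched step by step in Python. The order of the two steps and the intermediate form ονσι are assumptions made for illustration, as is the function name; the sketch only covers this one ending and is not a general sound-change engine.&lt;/p&gt;

```python
def doric_to_attic_3pl(ending):
    """Apply the two changes described above to a 3PL ending such as 'οντι'."""
    steps = [ending]
    # Step 1: Attic-Ionic has σι where other dialects have τι.
    if ending.endswith("τι"):
        ending = ending[:-2] + "σι"
        steps.append(ending)
    # Step 2: ν drops before σ and the preceding ο lengthens (written ου).
    if "ονσ" in ending:
        ending = ending.replace("ονσ", "ουσ")
        steps.append(ending)
    return steps


print(" > ".join(doric_to_attic_3pl("οντι")))  # οντι > ονσι > ουσι
```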
&lt;p&gt;What about the ῃ in the &lt;strong&gt;MID 2SG&lt;/strong&gt;? We don&#39;t need to go to another dialect to see traces of what&#39;s going on. In the NT we have the &lt;strong&gt;PM-4&lt;/strong&gt; circumflex verb:&lt;/p&gt;
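&lt;table&gt;
&lt;tr&gt;&lt;th&gt;INF&lt;/th&gt;&lt;td&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;1SG&lt;/th&gt;&lt;td&gt;καυχῶμαι&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;2SG&lt;/th&gt;&lt;td&gt;&lt;b&gt;καυχᾶσαι&lt;/b&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;3SG&lt;/th&gt;&lt;td&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;1PL&lt;/th&gt;&lt;td&gt;καυχώμεθα&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;2PL&lt;/th&gt;&lt;td&gt;καυχᾶσθε&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;3PL&lt;/th&gt;&lt;td&gt;καυχῶνται&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;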
&lt;p&gt;with &lt;strong&gt;ᾶσαι&lt;/strong&gt; for ᾷ. The ᾶσαι can be explained as the stem vowel α interacting with the ending εσαι. The ᾷ can be explained simply through the σ dropping out (and similarly the ῃ in the &lt;strong&gt;PM-1&lt;/strong&gt; and &lt;strong&gt;PM-2&lt;/strong&gt; and so on) plus our contraction rules.&lt;/p&gt;
&lt;p&gt;Interestingly, later Greek restored the uncontracted ending and we find it again in Modern Greek.&lt;/p&gt;
&lt;p&gt;And so we have the reconstructed endings:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;active&lt;/th&gt;
&lt;th&gt;middle&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INF&lt;/td&gt;
&lt;td&gt;ε εν&lt;/td&gt;
&lt;td&gt;ε σθαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;ω -&lt;/td&gt;
&lt;td&gt;ο μαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;ε ις&lt;/td&gt;
&lt;td&gt;ε σαι &amp;gt; ῃ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;ε ι&lt;/td&gt;
&lt;td&gt;ε ται&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;ο μεν&lt;/td&gt;
&lt;td&gt;ο μεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;ε τε&lt;/td&gt;
&lt;td&gt;ε σθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;ο ντι &amp;gt; ουσι(ν)&lt;/td&gt;
&lt;td&gt;ο νται&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;There are some tantalising patterns here, especially in the middle: the αι in 5 out of 7 cells; the μ/σ/τ in the 1st/2nd/3rd person.&lt;/p&gt;
&lt;p&gt;As usual I want to emphasize that the reconstructed forms in this table help explain things historically but should not necessarily be taken as an indication of a process that went on synchronically in the minds of native speakers. I&#39;m not aware of any evidence that native speakers would have, for example, thought of ουσι as being an underlying οντι, or ῃ as being an underlying εσαι.&lt;/p&gt;
&lt;p&gt;We haven&#39;t yet explained what&#39;s going on with the &lt;strong&gt;ACT 1SG&lt;/strong&gt; nor why ει would have been an alternative for ῃ in the &lt;strong&gt;MID 2SG&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;But other than the &lt;strong&gt;ACT 1SG&lt;/strong&gt;, all other endings start with either an ε or ο. We&#39;ll talk more about this later (including why this vowel is called the &lt;strong&gt;thematic vowel&lt;/strong&gt;) but note that which of the two vowels is used is completely predictable by what follows.&lt;/p&gt;
&lt;p&gt;If the following segment is nasal (μ or ν), the vowel is ο. If the following segment is ε, ι, σ, or τ, the vowel is ε. Most descriptions consider the ε the default and the nasal context leading to ο being the exception. But we could also look for features that ε, ι, σ, and τ have that μ and ν don&#39;t (other than just being NON-nasal).&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part nine of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 8</title>
    <link href="https://jktauber.com/2017/07/17/tour-greek-morphology-part-8/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 8"/>
    <published>2017-07-17</published>
    <updated>2017-07-17</updated>
    <id>https://jktauber.com/2017/07/17/tour-greek-morphology-part-8</id>
    <content type="html" xml:base="https://jktauber.com/2017/07/17/tour-greek-morphology-part-8/">&lt;p&gt;Part eight of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;So far, just for the active, we&#39;ve suggested the following contraction rules.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;έει &amp;gt; εῖ&lt;/li&gt;
&lt;li&gt;έω &amp;gt; ῶ&lt;/li&gt;
&lt;li&gt;έε &amp;gt; εῖ&lt;/li&gt;
&lt;li&gt;έο &amp;gt; οῦ&lt;/li&gt;
&lt;li&gt;έου &amp;gt; οῦ&lt;/li&gt;
&lt;li&gt;άω &amp;gt; ῶ&lt;/li&gt;
&lt;li&gt;άε &amp;gt; ᾶ&lt;/li&gt;
&lt;li&gt;άει &amp;gt; ᾷ (in the indicative) and ᾶ (in the infinitive)&lt;/li&gt;
&lt;li&gt;άο &amp;gt; ῶ&lt;/li&gt;
&lt;li&gt;άου &amp;gt; ῶ&lt;/li&gt;
&lt;li&gt;όω &amp;gt; ῶ&lt;/li&gt;
&lt;li&gt;όε &amp;gt; οῦ&lt;/li&gt;
&lt;li&gt;όει &amp;gt; οῖ (in the indicative) and οῦ (in the infinitive)&lt;/li&gt;
&lt;li&gt;όο &amp;gt; οῦ&lt;/li&gt;
&lt;li&gt;όου &amp;gt; οῦ&lt;/li&gt;
&lt;li&gt;ήω &amp;gt; ῶ&lt;/li&gt;
&lt;li&gt;ήε &amp;gt; ῆ&lt;/li&gt;
&lt;li&gt;ήει &amp;gt; ῇ (in the indicative) and ῆ (in the infinitive)&lt;/li&gt;
&lt;li&gt;ήο &amp;gt; ῶ&lt;/li&gt;
&lt;li&gt;ήου &amp;gt; ῶ&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In this post I want to explain why these aren&#39;t just an arbitrary set of sound changes and that they are really quite systematic. We&#39;ll say a little bit about Greek orthography and build a model using some simple phonological features that explains the core contraction rules quite compactly.&lt;/p&gt;
&lt;p&gt;Before I do that, though, I want to emphasize again that I&#39;m not suggesting these &#34;rules&#34; need to be learned by the language learner. They are historical explanations for the spelling of circumflex verb endings in certain dialects and I&#39;m discussing them to give people a flavour for linguistic description. &lt;strong&gt;The best way to learn the circumflex verbs is to produce and read them in context.&lt;/strong&gt; It really doesn&#39;t take long to just intuitively know that ἀγαπᾷς is a second person singular or that ἀγαπᾶν is an infinitive. You don&#39;t need to know the contraction rules or how to model them with phonological features.&lt;/p&gt;
&lt;p&gt;But if you&#39;re interested in WHY the forms are ἀγαπᾷς and ἀγαπᾶν (including why one has an iota subscript and the other doesn&#39;t) keep reading!&lt;/p&gt;
&lt;h2&gt;Orthography&lt;/h2&gt;
&lt;p&gt;You&#39;ve probably been told that ε and ο are always short vowels. As far as the LETTERS themselves go, in our standard Greek orthography, that is true. But a long ε and a long ο existed as sounds in Classical Greek and earlier. Different dialects wrote these differently. Some just wrote Ε and Ο regardless of whether they were long or short. This is similar to Α, Ι, or Υ, which could be used for both the short and long variants. The Ionians, however, used the digraphs ΕΙ and ΟΥ for the long-Ε and long-Ο respectively. At the time, this was NOT the same sound as the diphthongs ΕΙ and ΟΥ, despite being written the same. It is likely that the long ε and long ο were pronounced with the tongue a little higher up (hence closer to the way ι and υ were pronounced) to reduce any confusion with η and ω which were pronounced with a lower tongue, closer to α. The digraphs ΕΙ and ΟΥ, when used for the long ε and long ο, are sometimes called &#34;spurious diphthongs&#34; because they weren&#39;t actually diphthongs at all; they were long monophthongs.&lt;/p&gt;
&lt;p&gt;The Greeks started to standardize on the Ionian spelling and, in 403 BC, Athens officially adopted the Ionian spelling.&lt;/p&gt;
&lt;p&gt;This purely orthographic convention explains why εε &amp;gt; ει and οο &amp;gt; ου. That doesn&#39;t mean ALL occurrences of ει are long ε or all occurrences of ου are long ο. ει and ου CAN be true diphthongs, but when they come from ε+ε or ο+ο respectively, they are just long monophthongs.&lt;/p&gt;
&lt;p&gt;Now as already mentioned, both short and long α were just written as α and so αα &amp;gt; α is a similarly straightforward contraction (the result being a long α). If you have a circumflex or an iota subscript, the α must have been long.&lt;/p&gt;
&lt;h2&gt;Basic Contractions&lt;/h2&gt;
&lt;p&gt;So the diagonals of this contraction table make sense:&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;th&gt;&amp;nbsp;&lt;/th&gt;&lt;th&gt;ε&lt;/th&gt;&lt;th&gt;ο&lt;/th&gt;&lt;th&gt;α&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;ε&lt;/th&gt;&lt;td class=&#34;success&#34;&gt;ει&lt;/td&gt;&lt;td&gt;ου&lt;/td&gt;&lt;td&gt;η&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;ο&lt;/th&gt;&lt;td&gt;ου&lt;/td&gt;&lt;td class=&#34;success&#34;&gt;ου&lt;/td&gt;&lt;td&gt;ω&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;α&lt;/th&gt;&lt;td&gt;α&lt;/td&gt;&lt;td&gt;ω&lt;/td&gt;&lt;td class=&#34;success&#34;&gt;α&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;Now ε+ο and ο+ε both result in a long ο (written ου). The order doesn&#39;t matter. The ο wins out over the ε and the ε assimilates to ο resulting in the equivalent to ο+ο.&lt;/p&gt;
&lt;p&gt;Both α+ο and ο+α result in ω and again order doesn&#39;t matter. At the time of the spelling standardization, ω was effectively in between α and ο so this makes sense.&lt;/p&gt;
&lt;p&gt;Note, however, that α+ε and ε+α don&#39;t behave the same way in our table above. α+ε results in α but ε+α results in η. We might expect both to be η given how α+ο and ο+α behaved. It seems that order matters in some cases but not others.&lt;/p&gt;
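&lt;p&gt;To make the point about order concrete, here is a small Python sketch that stores the contraction table above as a lookup and lists the vowel pairs whose result depends on which vowel comes first. The names are hypothetical and the table is copied as given, with the row as the first vowel and the column as the second.&lt;/p&gt;

```python
# (first vowel, second vowel) -> contracted result, copied from the table above.
CONTRACTIONS = {
    ("ε", "ε"): "ει", ("ε", "ο"): "ου", ("ε", "α"): "η",
    ("ο", "ε"): "ου", ("ο", "ο"): "ου", ("ο", "α"): "ω",
    ("α", "ε"): "α", ("α", "ο"): "ω", ("α", "α"): "α",
}


def order_sensitive_pairs(table):
    """Return the unordered vowel pairs whose result depends on which vowel comes first."""
    return sorted(
        {tuple(sorted(pair)) for pair in table if table[pair] != table[pair[::-1]]}
    )


print(order_sensitive_pairs(CONTRACTIONS))  # [('α', 'ε')]
```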
&lt;h2&gt;Phonological Features&lt;/h2&gt;
&lt;p&gt;One way we can model all this is by assigning each of the vowels binary features of &lt;strong&gt;low&lt;/strong&gt;, &lt;strong&gt;back&lt;/strong&gt;, and &lt;strong&gt;round&lt;/strong&gt; and making generalisations about those categories.&lt;/p&gt;
&lt;div align=&#34;center&#34;&gt;
    &lt;img src=&#34;/images/mid-low-vowels.png&#34;&gt;
&lt;/div&gt;

&lt;p&gt;In other words:&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;th&gt;&amp;nbsp;&lt;/th&gt;&lt;th&gt;low&lt;/th&gt;&lt;th&gt;back&lt;/th&gt;&lt;th&gt;round&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;ε&lt;/th&gt;&lt;td&gt;&lt;big&gt;-&lt;/big&gt;&lt;/td&gt;&lt;td&gt;&lt;big&gt;-&lt;/big&gt;&lt;/td&gt;&lt;td&gt;&lt;big&gt;-&lt;/big&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;ο&lt;/th&gt;&lt;td&gt;&lt;big&gt;-&lt;/big&gt;&lt;/td&gt;&lt;td&gt;&lt;big&gt;+&lt;/big&gt;&lt;/td&gt;&lt;td&gt;&lt;big&gt;+&lt;/big&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;η&lt;/th&gt;&lt;td&gt;&lt;big&gt;+&lt;/big&gt;&lt;/td&gt;&lt;td&gt;&lt;big&gt;-&lt;/big&gt;&lt;/td&gt;&lt;td&gt;&lt;big&gt;-&lt;/big&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;ω&lt;/th&gt;&lt;td&gt;&lt;big&gt;+&lt;/big&gt;&lt;/td&gt;&lt;td&gt;&lt;big&gt;+&lt;/big&gt;&lt;/td&gt;&lt;td&gt;&lt;big&gt;+&lt;/big&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;α&lt;/th&gt;&lt;td&gt;&lt;big&gt;+&lt;/big&gt;&lt;/td&gt;&lt;td&gt;&lt;big&gt;+&lt;/big&gt;&lt;/td&gt;&lt;td&gt;&lt;big&gt;-&lt;/big&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;Note that not all combinations are possible and &lt;strong&gt;+round&lt;/strong&gt; implies &lt;strong&gt;+back&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;(We haven&#39;t included ι or υ here as they don&#39;t play a part in this analysis.)&lt;/p&gt;
&lt;p&gt;Now all the ε, ο, α contractions can be explained in terms of assimilation of &lt;strong&gt;+low&lt;/strong&gt; and &lt;strong&gt;+round&lt;/strong&gt; and &lt;em&gt;partial&lt;/em&gt; assimilation of &lt;strong&gt;+back&lt;/strong&gt;, as follows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the output is &lt;strong&gt;+low&lt;/strong&gt; if &lt;em&gt;either&lt;/em&gt; input vowel is &lt;strong&gt;+low&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;the output is &lt;strong&gt;+round&lt;/strong&gt; if &lt;em&gt;either&lt;/em&gt; input vowel is &lt;strong&gt;+round&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;the output is &lt;strong&gt;+back&lt;/strong&gt; if the &lt;em&gt;first&lt;/em&gt; input vowel is &lt;strong&gt;+back&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;the output is &lt;strong&gt;+back&lt;/strong&gt; if it is &lt;strong&gt;+round&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The rules also explain why any vowel + ω goes to ω. In fact, if you work them through, these simple rules explain all 23 contractions in our list at the top of the post (and more that haven&#39;t come into play yet) with just one additional rule:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;if you have more than two vowels, the contraction is left associative&lt;/li&gt;
&lt;/ul&gt;
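&lt;p&gt;To make it easier to work the rules through, here is a minimal Python sketch of them. The feature values are copied from the table above; the names &lt;code&gt;FEATURES&lt;/code&gt;, &lt;code&gt;contract_pair&lt;/code&gt; and &lt;code&gt;contract&lt;/code&gt; are made up for illustration. The sketch ignores accents, ι and υ entirely, and it assumes the result is always spelled as a long vowel (ει for long ε, ου for long ο, plain α for long α).&lt;/p&gt;

```python
from functools import reduce

# Feature values (low, back, round) copied from the table above.
FEATURES = {
    "ε": (False, False, False),
    "ο": (False, True, True),
    "η": (True, False, False),
    "ω": (True, True, True),
    "α": (True, True, False),
}

# Reverse lookup from a feature bundle back to a vowel quality.
VOWELS = {bundle: vowel for vowel, bundle in FEATURES.items()}

# Assumed spelling of the long result for each vowel quality.
LONG_SPELLING = {"ε": "ει", "ο": "ου", "η": "η", "ω": "ω", "α": "α"}


def contract_pair(first, second):
    """Apply the four feature rules to two vowel qualities."""
    low1, back1, round1 = FEATURES[first]
    low2, back2, round2 = FEATURES[second]
    low = low1 or low2          # +low if either input is +low
    rounded = round1 or round2  # +round if either input is +round
    back = back1 or rounded     # +back if the first input is +back, or if +round
    return VOWELS[(low, back, rounded)]


def contract(vowels):
    """Contract a sequence of vowel qualities left-associatively."""
    return LONG_SPELLING[reduce(contract_pair, vowels)]
```

&lt;p&gt;With these definitions, &lt;code&gt;contract(&#34;εα&#34;)&lt;/code&gt; gives η while &lt;code&gt;contract(&#34;αε&#34;)&lt;/code&gt; gives α, which is exactly the order dependence seen in the table, and &lt;code&gt;contract(&#34;αο&#34;)&lt;/code&gt; and &lt;code&gt;contract(&#34;οα&#34;)&lt;/code&gt; both give ω.&lt;/p&gt;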
&lt;p&gt;There are likely other solutions with other features and rules but my analysis roughly follows that of Sommerstein in &lt;em&gt;The Sound Pattern of Ancient Greek&lt;/em&gt;, that of Bubeník in &lt;em&gt;The Phonological Interpretation of Ancient Greek: A Pandialectal Analysis&lt;/em&gt; (which also considers differences in things like the Doric dialect), and apparently that of Lejeune in &lt;em&gt;Phonétique historique du mycénien et du grec ancien&lt;/em&gt; (on which Bubeník&#39;s is based). This style of analysis is typical of the early second half of the twentieth century so I&#39;m not claiming it&#39;s in any way state-of-the-art. But it demonstrates that the contraction rules are very systematic.&lt;/p&gt;
&lt;h2&gt;The Difference in the Infinitive vs Indicative&lt;/h2&gt;
&lt;p&gt;There is one final thing we haven&#39;t explicitly addressed but which is fully explained by these simple rules on features: why is άει sometimes ᾷ and sometimes ᾶ (and likewise why is όει sometimes οῖ and sometimes οῦ)?&lt;/p&gt;
&lt;p&gt;The answer is simply that if the ει is a spurious diphthong (i.e. actually just a long ε, as if from εε) then our simple rules will result in long ᾶ but if it&#39;s a true diphthong, the result is long α + ι which is written ᾷ. Similarly in the case of όει, a spurious diphthong will result in οῦ (from οεε &amp;gt; οοε &amp;gt; οο &amp;gt; ου) but a true diphthong in οῖ (οει &amp;gt; οοι &amp;gt; οι).&lt;/p&gt;
&lt;p&gt;What this tells us is that the ει in the ειν ending in the infinitive is a spurious diphthong but the ει in εις and ει in the second and third person singular actives are true diphthongs.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part eight of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Man Walks Into A Bar</title>
    <link href="https://jktauber.com/2017/07/16/man-walks-bar/" rel="alternate" type="text/html" title="A Man Walks Into A Bar"/>
    <published>2017-07-16</published>
    <updated>2017-07-16</updated>
    <id>https://jktauber.com/2017/07/16/man-walks-bar</id>
    <content type="html" xml:base="https://jktauber.com/2017/07/16/man-walks-bar/">&lt;p&gt;I’ve thought for a while that “A man walks into a bar” jokes are a great example of how definiteness works in English. I mentioned this to Jonathan Robie in Cambridge and he seemed to like the example too so I thought I’d share it more broadly.&lt;/p&gt;
&lt;p&gt;Consider the standard joke form:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A man walks into a bar. The bartender says X. The man says Y.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Notice this has two indefinite articles and two definite articles. When do we use the indefinite article and when do we use the definite article?&lt;/p&gt;
&lt;p&gt;In the first sentence of the joke, we haven’t been introduced to either the man or the bar before, and so we use the indefinite article for both.&lt;/p&gt;
&lt;p&gt;We can’t say “* &lt;strong&gt;The&lt;/strong&gt; man walks into a bar” unless he’s been introduced before. Likewise we can’t say “* &lt;strong&gt;the&lt;/strong&gt; bar” unless the bar’s been introduced before. For example,&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Chris is one crazy guy! The man walks into a bar...&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;is fine if we take the man to be Chris. Similarly,&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You know that bar on 52nd Street? A man walks into the bar...&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;works if the bar in the joke is the one on 52nd Street.&lt;/p&gt;
&lt;p&gt;If we were telling a second joke, we could use &lt;strong&gt;the&lt;/strong&gt; to indicate the man (or the bar) was the same but notice we’d have to use something like &lt;strong&gt;another&lt;/strong&gt; and NOT &lt;strong&gt;a&lt;/strong&gt; for introducing a second bar (or man):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Later, the man walks into another bar...&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;or&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Later, another man walks into the bar...&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Notice in our original joke, the third sentence starts “&lt;strong&gt;The&lt;/strong&gt; man”. This makes sense because that man has already been introduced. We wouldn’t say “* The man walks into a bar. The bartender says X. &lt;strong&gt;A&lt;/strong&gt; man says Y.” Even if it were a different man, we’d probably use something like “&lt;strong&gt;Another&lt;/strong&gt; man”.&lt;/p&gt;
&lt;p&gt;But notice we &lt;em&gt;did&lt;/em&gt; use &lt;strong&gt;the&lt;/strong&gt; with the bartender even though he or she has NOT been introduced yet. The reason is our &lt;em&gt;frame&lt;/em&gt; for a bar is that it has a bartender. The existence of the bartender has effectively been set up by us having a bar and that’s the bartender we want to reference so it’s not a completely new reference. Saying “* A man walks into a bar. &lt;strong&gt;A&lt;/strong&gt; bartender says X” would be odd. Notice also that even if the bartender is a man, the following “The man says Y” is unambiguous.&lt;/p&gt;
&lt;p&gt;Even if there were more than one bartender (certainly possible, although not prototypical for the frame) we’d have to say something like “&lt;strong&gt;One of the&lt;/strong&gt; bartenders says X”.&lt;/p&gt;
&lt;p&gt;This can be demonstrated with an example where we EXPECT multiple instances.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A man walks into a classroom. &lt;strong&gt;One of the&lt;/strong&gt; students says X.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In this case, it would be odd to say “* &lt;strong&gt;A&lt;/strong&gt; student says X” and even odder to say “* &lt;strong&gt;the&lt;/strong&gt; student says X”. We want definiteness (because the classroom frame has already established the likelihood of a &lt;em&gt;group&lt;/em&gt; of students and that’s the group we want to reference a member of) but because it’s a group, we need to say “one of” to call out an individual.&lt;/p&gt;
&lt;p&gt;“One of the” calls out an indefinite member of a definite group.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I’ve thought for a while that “A man walks into a bar” jokes are a great example of how definiteness works in English. I mentioned this to Jonathan Robie in Cambridge and he seemed to like the example too so I thought I’d share it more broadly.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 7</title>
    <link href="https://jktauber.com/2017/07/14/tour-greek-morphology-part-7/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 7"/>
    <published>2017-07-14</published>
    <updated>2017-07-14</updated>
    <id>https://jktauber.com/2017/07/14/tour-greek-morphology-part-7</id>
    <content type="html" xml:base="https://jktauber.com/2017/07/14/tour-greek-morphology-part-7/">&lt;p&gt;Part seven of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;κλῶμεν in 1Co 10.16 is clearly &lt;strong&gt;ACT 1PL&lt;/strong&gt; but we can&#39;t tell from just that if it&#39;s a &lt;strong&gt;PA-4&lt;/strong&gt; or &lt;strong&gt;PA-5&lt;/strong&gt;. In authors like Galen and Hippocrates we find the &lt;strong&gt;MID 3SG&lt;/strong&gt; κλᾶται which we&#39;ve called &lt;strong&gt;PM-4&lt;/strong&gt;, which strongly suggests it&#39;s a &lt;strong&gt;PA-4&lt;/strong&gt; in the active.&lt;/p&gt;
&lt;p&gt;If that&#39;s the case, we&#39;d expect an &lt;strong&gt;ACT 2SG&lt;/strong&gt; of κλᾷς, an &lt;strong&gt;ACT 3SG&lt;/strong&gt; of κλᾷ, and an &lt;strong&gt;ACT 3PL&lt;/strong&gt; of κλῶσι(ν).&lt;/p&gt;
&lt;p&gt;But in various authors we can find the respective forms κλάεις, κλάει, and κλάουσι.&lt;/p&gt;
&lt;p&gt;This suggests that &lt;strong&gt;α&lt;/strong&gt; plays the same role in &lt;strong&gt;PA-4&lt;/strong&gt; and &lt;strong&gt;PM-4&lt;/strong&gt; as &lt;strong&gt;ε&lt;/strong&gt; did in &lt;strong&gt;PA-2&lt;/strong&gt; and &lt;strong&gt;PM-2&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;For this to work,&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;άω &amp;gt; ῶ&lt;/li&gt;
&lt;li&gt;άε &amp;gt; ᾶ&lt;/li&gt;
&lt;li&gt;άει &amp;gt; ᾷ (in the indicative) and ᾶ (in the infinitive)&lt;/li&gt;
&lt;li&gt;άο &amp;gt; ῶ&lt;/li&gt;
&lt;li&gt;άου &amp;gt; ῶ&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We&#39;ll discuss the άει issue in the next post.&lt;/p&gt;
&lt;p&gt;What about &lt;strong&gt;PA-3&lt;/strong&gt; and &lt;strong&gt;PM-3&lt;/strong&gt;? We&#39;re basically trying to solve for &lt;strong&gt;x&lt;/strong&gt; given:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;xω &amp;gt; ω&lt;/li&gt;
&lt;li&gt;xε &amp;gt; ου&lt;/li&gt;
&lt;li&gt;xει &amp;gt; οι (in the indicative) and ου (in the infinitive)&lt;/li&gt;
&lt;li&gt;xο &amp;gt; ου&lt;/li&gt;
&lt;li&gt;xου &amp;gt; ου&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It&#39;s difficult to find examples in the present verb forms of other dialects and texts, but even in the New Testament it&#39;s not difficult to find cases where οε and οο are alternatively spelled ου (e.g. ἀγαθοεργ- in 1 Tim and ἀγαθουργ- in Acts). This makes &lt;strong&gt;ο&lt;/strong&gt; a possible candidate for &lt;strong&gt;x&lt;/strong&gt; and note, in particular, the &lt;strong&gt;ACT 3SG&lt;/strong&gt; forms have so far all been quite transparent in what vowel ends the stem.&lt;/p&gt;
&lt;p&gt;So we appear to have:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;όω &amp;gt; ῶ&lt;/li&gt;
&lt;li&gt;όε &amp;gt; οῦ&lt;/li&gt;
&lt;li&gt;όει &amp;gt; οῖ (in the indicative) and οῦ (in the infinitive)&lt;/li&gt;
&lt;li&gt;όο &amp;gt; οῦ&lt;/li&gt;
&lt;li&gt;όου &amp;gt; οῦ&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And although a proper argument will get us quite far afield (maybe one day), it turns out &lt;strong&gt;PA-5&lt;/strong&gt; and &lt;strong&gt;PM-5&lt;/strong&gt; can be explained by:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ήω &amp;gt; ῶ&lt;/li&gt;
&lt;li&gt;ήε &amp;gt; ῆ&lt;/li&gt;
&lt;li&gt;ήει &amp;gt; ῇ (in the indicative) and ῆ (in the infinitive)&lt;/li&gt;
&lt;li&gt;ήο &amp;gt; ῶ&lt;/li&gt;
&lt;li&gt;ήου &amp;gt; ῶ&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So, in summary, the circumflex verbs can be explained through a historical interaction (generally referred to as a contraction) between a vowel at the end of the original stem and the vowel at the start of what is added to it.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;PA-2&lt;/strong&gt; and &lt;strong&gt;PM-2&lt;/strong&gt; come from a stem originally ending in &lt;strong&gt;έ&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;PA-3&lt;/strong&gt; and &lt;strong&gt;PM-3&lt;/strong&gt; come from a stem originally ending in &lt;strong&gt;ό&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;PA-4&lt;/strong&gt; and &lt;strong&gt;PM-4&lt;/strong&gt; come from a stem originally ending in &lt;strong&gt;ά&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;PA-5&lt;/strong&gt; and &lt;strong&gt;PM-5&lt;/strong&gt; come from a stem originally ending in &lt;strong&gt;ή&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
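&lt;p&gt;As a rough illustration, the three lists of changes above can be collected into a lookup table and applied to an uncontracted form. This is only a sketch: the names &lt;code&gt;CONTRACTIONS&lt;/code&gt;, &lt;code&gt;contract_form&lt;/code&gt; and the &lt;code&gt;infinitive&lt;/code&gt; flag are made up for the example, it only covers the ά, ό and ή changes listed in this post, and it treats the forms as plain strings (normalised to NFC so that precomposed and decomposed accents compare equal).&lt;/p&gt;

```python
import unicodedata


def nfc(text):
    """Normalise to NFC so differently encoded accents compare equal."""
    return unicodedata.normalize("NFC", text)


# The changes listed in this post that do not depend on mood.
CONTRACTIONS = {
    "άω": "ῶ", "άε": "ᾶ", "άο": "ῶ", "άου": "ῶ",
    "όω": "ῶ", "όε": "οῦ", "όο": "οῦ", "όου": "οῦ",
    "ήω": "ῶ", "ήε": "ῆ", "ήο": "ῶ", "ήου": "ῶ",
}

# The ει cases come out differently in the indicative and the infinitive.
EI_INDICATIVE = {"άει": "ᾷ", "όει": "οῖ", "ήει": "ῇ"}
EI_INFINITIVE = {"άει": "ᾶ", "όει": "οῦ", "ήει": "ῆ"}


def contract_form(form, infinitive=False):
    """Replace the first uncontracted vowel sequence found in form."""
    form = nfc(form)
    table = dict(CONTRACTIONS)
    table.update(EI_INFINITIVE if infinitive else EI_INDICATIVE)
    table = {nfc(source): nfc(target) for source, target in table.items()}
    # Try longer sequences first so that άου is matched before άο.
    for source in sorted(table, key=len, reverse=True):
        if source in form:
            return form.replace(source, table[source], 1)
    return form
```

&lt;p&gt;Applied to the uncontracted forms mentioned earlier, &lt;code&gt;contract_form(&#34;κλάεις&#34;)&lt;/code&gt;, &lt;code&gt;contract_form(&#34;κλάει&#34;)&lt;/code&gt; and &lt;code&gt;contract_form(&#34;κλάουσι&#34;)&lt;/code&gt; give κλᾷς, κλᾷ and κλῶσι respectively.&lt;/p&gt;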
&lt;p&gt;Often circumflex verbs are referred to as &lt;strong&gt;contract verbs&lt;/strong&gt; but, while contraction is indeed the historical explanation for how the circumflex verbs got their forms, I like the name &lt;strong&gt;circumflex verbs&lt;/strong&gt; because it describes an actual synchronic characteristic of the verb forms rather than an explanation of how they happened to get like that. It&#39;s interesting that ancient grammarians like Dionysius Thrax called them &lt;strong&gt;perispomenon&lt;/strong&gt; verbs (the term for words with a circumflex on the last syllable) and called &lt;strong&gt;PA-1&lt;/strong&gt;/&lt;strong&gt;PM-1&lt;/strong&gt; verbs &lt;strong&gt;barytone&lt;/strong&gt; verbs (the term for words with NO ACCENT on the last syllable).&lt;/p&gt;
&lt;p&gt;In the next post, we&#39;ll explore why the contraction rules are not random but, in fact, are quite systematic. We&#39;ll also touch on why the contractions don&#39;t seem to work quite the same way in the infinitive.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part seven of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 6</title>
    <link href="https://jktauber.com/2017/07/11/tour-greek-morphology-part-6/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 6"/>
    <published>2017-07-11</published>
    <updated>2017-07-11</updated>
    <id>https://jktauber.com/2017/07/11/tour-greek-morphology-part-6</id>
    <content type="html" xml:base="https://jktauber.com/2017/07/11/tour-greek-morphology-part-6/">&lt;p&gt;Part six of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;Every form we&#39;ve seen of λύω so far starts with &lt;strong&gt;λυ&lt;/strong&gt;, unchanged except for accent. Also, all the forms that start with &lt;strong&gt;λυ&lt;/strong&gt; (or &lt;strong&gt;λύ&lt;/strong&gt;) have been forms of λύω.&lt;/p&gt;
&lt;p&gt;Every form we&#39;ve seen so far that&#39;s active first person plural ends with &lt;strong&gt;μεν&lt;/strong&gt;. Also, all the forms that end with &lt;strong&gt;μεν&lt;/strong&gt; have been active first person plural.&lt;/p&gt;
&lt;p&gt;Put another way, the &lt;strong&gt;λύ&lt;/strong&gt; in λύομεν has nothing to do with being active first person plural and the &lt;strong&gt;μεν&lt;/strong&gt; in λύομεν has nothing to do with being a form of λύω (at least based on every paradigm we&#39;ve seen so far).&lt;/p&gt;
&lt;p&gt;What about the &lt;strong&gt;ο&lt;/strong&gt; in between them? It cannot (at least at the moment) be said to only depend on the fact we have a form of λύω nor can it be said to only depend on the fact we have an active first person plural form. The vowel seems to depend BOTH on the lexical item AND the morphosyntactic properties of voice, person, and number.&lt;/p&gt;
&lt;p&gt;Similarly with ποιεῖτε. The initial &lt;strong&gt;ποι&lt;/strong&gt; indicates and only indicates the lexical item. The final &lt;strong&gt;τε&lt;/strong&gt; indicates and only indicates the active second person plural. The fact we have &lt;strong&gt;εῖ&lt;/strong&gt; rather than &lt;strong&gt;ο&lt;/strong&gt; (or &lt;strong&gt;ε&lt;/strong&gt; or &lt;strong&gt;οῦ&lt;/strong&gt; or any other vowel) is because of BOTH the lexical item and the morphosyntactic properties.&lt;/p&gt;
&lt;p&gt;What is happening here becomes very clear when we look at some older texts or texts in more conservative dialects. For example, in Herodotus, written in the Ionic dialect, we don&#39;t find ποιεῖτε but instead ποιέετε. In fact, here&#39;s what we find:&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;th&gt;ACT INF&lt;/th&gt;&lt;td&gt;ποιέειν&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;ACT 1SG&lt;/th&gt;&lt;td&gt;ποιέω&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;ACT 2SG&lt;/th&gt;&lt;td&gt;ποιέεις&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;ACT 3SG&lt;/th&gt;&lt;td&gt;ποιέει&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;ACT 1PL&lt;/th&gt;&lt;td&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;ACT 2PL&lt;/th&gt;&lt;td&gt;ποιέετε&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;th&gt;ACT 3PL&lt;/th&gt;&lt;td&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;There are a couple of things about this that are remarkable. Firstly, if we split off the common part (now &lt;strong&gt;ποιέ&lt;/strong&gt; rather than ποι) then our distinguishers are all IDENTICAL to those of λύω. Secondly, this restores the accent placement to be properly recessive.&lt;/p&gt;
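&lt;p&gt;The first observation is easy to check mechanically. Here is a small Python sketch that strips the common part off each form and compares what is left. The ποιέω forms are the ones in the table above; the λύω forms are filled in from the standard present active paradigm, and the function name &lt;code&gt;distinguishers&lt;/code&gt; is just made up for the example.&lt;/p&gt;

```python
import unicodedata


def nfc(text):
    """Normalise to NFC so differently encoded accents compare equal."""
    return unicodedata.normalize("NFC", text)


# Herodotean forms from the table above (the 1PL and 3PL cells are empty there).
POIEO = {
    "ACT INF": "ποιέειν",
    "ACT 1SG": "ποιέω",
    "ACT 2SG": "ποιέεις",
    "ACT 3SG": "ποιέει",
    "ACT 2PL": "ποιέετε",
}

# Corresponding forms of λύω, assumed from the standard paradigm.
LUO = {
    "ACT INF": "λύειν",
    "ACT 1SG": "λύω",
    "ACT 2SG": "λύεις",
    "ACT 3SG": "λύει",
    "ACT 2PL": "λύετε",
}


def distinguishers(forms, stem):
    """Strip the common part from each form and return what remains."""
    stem = nfc(stem)
    result = {}
    for label, form in forms.items():
        form = nfc(form)
        if not form.startswith(stem):
            raise ValueError(f"{form} does not start with {stem}")
        result[label] = form[len(stem):]
    return result


# The two sets of distinguishers should come out the same.
same = distinguishers(POIEO, "ποιέ") == distinguishers(LUO, "λύ")
```

&lt;p&gt;With &lt;strong&gt;ποιέ&lt;/strong&gt; and &lt;strong&gt;λύ&lt;/strong&gt; as the common parts, &lt;code&gt;same&lt;/code&gt; is true for the five attested cells.&lt;/p&gt;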
&lt;p&gt;Our ποιῶ and ποιεῖτε are so accented (and not *ποίω or *ποίειτε) because the accent has remained on the same mora (relative to the start) as the older form.&lt;/p&gt;
&lt;p&gt;The vowels are thus explained by noting that historically:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;έει &amp;gt; εῖ&lt;/li&gt;
&lt;li&gt;έω &amp;gt; ῶ&lt;/li&gt;
&lt;li&gt;έε &amp;gt; εῖ&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Even without finding the necessary forms in Herodotus, we can infer (assuming the ποιέ is consistent and the distinguishers are those of λύω) the forms missing above and hence the following additional historical vowel changes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;έο &amp;gt; οῦ&lt;/li&gt;
&lt;li&gt;έου &amp;gt; οῦ&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And making the same assumption about the middle forms, we can add:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;έῃ &amp;gt; ῇ&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All the &lt;strong&gt;PA-2&lt;/strong&gt; and &lt;strong&gt;PM-2&lt;/strong&gt; endings can now be explained by:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the verb-specific common part (the &lt;strong&gt;stem&lt;/strong&gt;) ending in &lt;strong&gt;ε&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;the voice / person / number endings originally being identical to those of λύω&lt;/li&gt;
&lt;li&gt;the six historical vowel changes listed (referred to as &lt;strong&gt;contractions&lt;/strong&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In the tour&#39;s next post, we&#39;ll see if we can similarly explain the other forms we&#39;ve seen. Then, in a subsequent post, we&#39;ll come back to these vowel changes and see what&#39;s systematic about them.&lt;/p&gt;
&lt;p&gt;I want to close by emphasizing that I am only trying to describe HOW the circumflex verbs came about, not suggest anything about how native speakers processed or generated the contracted forms. As an analogy: it might be &lt;em&gt;interesting&lt;/em&gt; to learn why the English words &lt;em&gt;foot&lt;/em&gt; and &lt;em&gt;feet&lt;/em&gt; are spelled the way they are relative to how they are pronounced but that explanation doesn&#39;t bear much, if any, relation to what&#39;s going on in the minds of native speakers nor is it necessarily of any use to people learning English as a second language. I&#39;ll touch on that again in a few posts&#39; time, but you can also read my 2015 post &lt;a href=&#34;/2015/11/19/dangers-reconstructing-too-much-morphophonology/&#34;&gt;The Dangers of Reconstructing Too Much Morphophonology&lt;/a&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part six of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">Categories of Reader Work</title>
    <link href="https://jktauber.com/2017/07/10/categories-reader-work/" rel="alternate" type="text/html" title="Categories of Reader Work"/>
    <published>2017-07-10</published>
    <updated>2017-07-10</updated>
    <id>https://jktauber.com/2017/07/10/categories-reader-work</id>
    <content type="html" xml:base="https://jktauber.com/2017/07/10/categories-reader-work/">&lt;p&gt;I sometimes get people expressing an interest in my Greek reader work or get asked about the status of my &#34;reader&#34; and I have to ask them to clarify which reader they mean. I thought I might do a quick post where I spell out various &#34;reader&#34; projects I have worked on and am working on.&lt;/p&gt;
&lt;p&gt;My interest in tools for helping read Greek (especially, but by no means only, the New Testament) goes back at least thirteen or fourteen years. In a &lt;a href=&#34;/2004/11/26/programmed-vocabulary-learning-travelling-salesman/&#34;&gt;2004 post&lt;/a&gt; copied over to this blog, I talk about algorithms for ordering vocabulary to accelerate verse coverage. It was around this time I was also working on what became &lt;a href=&#34;http://quisition.com/&#34;&gt;Quisition&lt;/a&gt;, a flashcard site with spaced repetition.&lt;/p&gt;
&lt;p&gt;In November 2005, I registered the domain &lt;code&gt;readjohn.com&lt;/code&gt; with a view to building a site to help people learn Greek by reading through John&#39;s gospel. The reason for John was not only the simplicity of its Greek but the fact it&#39;s the one thing I had the OpenText analysis for at the time. As proof I had more than just the GNT in mind, I point out that I registered &lt;code&gt;readhomer.com&lt;/code&gt; just two months later. I wasn&#39;t just thinking Greek either, as I registered &lt;code&gt;readdante.com&lt;/code&gt; at the same time.&lt;/p&gt;
&lt;p&gt;Vocabulary was just an initial part of the model of what it takes to be able to read a text. It happens to be the easiest to model because all it takes, to first approximation, is a lemmatized text. But it illustrates the basic concept: if you model what is needed to read a text and you model what a student knows, you can:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;help order texts (including individual clauses or even phrases) in a way that&#39;s appropriate to the student&#39;s level&lt;/li&gt;
&lt;li&gt;appropriately scaffold the texts with just enough information to fill in the gap in their understanding&lt;/li&gt;
&lt;/ul&gt;
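&lt;p&gt;To make the vocabulary version of that concrete, here is a minimal Python sketch. It assumes each passage is just a list of lemmas and the student model is just a set of known lemmas; the function names and the sample data are made up for illustration and are not taken from any of my actual code.&lt;/p&gt;

```python
def coverage(passage_lemmas, known_lemmas):
    """Fraction of a passage's lemma tokens that the student already knows."""
    if not passage_lemmas:
        return 1.0
    known = sum(1 for lemma in passage_lemmas if lemma in known_lemmas)
    return known / len(passage_lemmas)


def order_passages(passages, known_lemmas):
    """Sort passage ids from most to least readable for this student."""
    return sorted(
        passages,
        key=lambda ref: coverage(passages[ref], known_lemmas),
        reverse=True,
    )


def gaps(passage_lemmas, known_lemmas):
    """Lemmas that would need scaffolding (a gloss, say) in this passage."""
    return sorted(set(passage_lemmas) - set(known_lemmas))


# Made-up sample data: two short passages and what a beginner might know.
sample_passages = {
    "passage-1": ["καί", "λέγω", "αὐτός"],
    "passage-2": ["εὑρίσκω", "ὁ", "Μεσσίας"],
}
sample_known = {"καί", "λέγω", "αὐτός", "ὁ"}
reading_order = order_passages(sample_passages, sample_known)
```

&lt;p&gt;A fuller model would track much more than lemmas, but the shape stays the same: one function says how readable a passage is for a given student, and another says what still needs to be scaffolded.&lt;/p&gt;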
&lt;p&gt;One thing I was experimenting with for scaffolding was inlining Greek that the student could understand (according to the ordering generated by my vocabulary algorithms) in larger text kept in English. So in the first lesson, the student might be given something like John 1.41 in this form:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;He first found his own brother Simon καὶ λέγει αὐτῷ, &#34;We have found the Messiah!&#34;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The combination of vocabulary ordering algorithms (driven by clause-level analysis of John&#39;s gospel) with this sort of inlining I was calling a &lt;strong&gt;New Kind of Graded Reader&lt;/strong&gt; and you can find a lot of posts from around March 2008 on this blog about it including &lt;a href=&#34;/2008/02/10/new-kind-graded-reader/&#34;&gt;this video&lt;/a&gt;. I subsequently did &lt;a href=&#34;/2010/03/28/my-bibletech-2010-talk/&#34;&gt;a full-length talk at BibleTech 2010&lt;/a&gt;. There&#39;s also &lt;a href=&#34;/2010/04/25/inline-replacement-john-2/&#34;&gt;a post with an extended example of the inlining approach&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;That initial category of reader work is still alive and by no means abandoned; it&#39;s just taking a long time to get the analysis broadened to take into account not just vocabulary but inflectional morphology, lexical relatedness, syntactic constructions, etc. In fact, a large part of my linguistic analysis work is motivated by the reader work (which was a big theme of &lt;a href=&#34;/2015/05/06/my-bibletech-2015-talk/&#34;&gt;my BibleTech 2015 talk&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;The &lt;em&gt;second&lt;/em&gt;, somewhat independent (although still very much corpus-driven and using much of the same machine-actionable linguistic data) reader project was the semi-automated generation of more traditional print readers (the sort with rarer words glossed in footnotes and perhaps more obscure syntactic constructions or idioms commented on). You can read more about it in &lt;a href=&#34;/2015/11/07/generating-readers/&#34;&gt;this post&lt;/a&gt;. One aim with the &lt;strong&gt;semi-automatic generation of printed readers&lt;/strong&gt; was being able to customize them quite easily to a particular level. The scaffolding wouldn&#39;t necessarily be adaptive but it could be personalized.&lt;/p&gt;
&lt;p&gt;Again this is still of great interest to me and motivates a lot of work on machine-actionable data. While I might experiment with approaches other than using TeX, I still want to do more in this area, most likely collaborating with people interested in particular texts (and able to help work on glosses and syntactic commentary).&lt;/p&gt;
&lt;p&gt;A &lt;em&gt;third&lt;/em&gt; category of work is a loose collection of various little prototypes over the years for ways of presenting information in a reader. This includes things like interlinears, colour-coded texts, various ways of showing dependency relations, etc. Brian Rosner and I consolidated these prototypes in a &lt;strong&gt;framework for generating static HTML files&lt;/strong&gt; in &lt;a href=&#34;https://github.com/jtauber/online-reader&#34;&gt;https://github.com/jtauber/online-reader&lt;/a&gt;. There are various online demos linked in the README.&lt;/p&gt;
&lt;p&gt;That repo &lt;em&gt;did&lt;/em&gt; initially include a dynamic reading environment written in Vue.js but that was broken out as the starting point for DeepReader (see below).&lt;/p&gt;
&lt;p&gt;The &lt;em&gt;fourth&lt;/em&gt; category of work (which goes back to my vision for readjohn.com, readhomer.com and readdante.com when I registered the domains) is an &lt;strong&gt;online adaptive reading environment with integrated learning tools&lt;/strong&gt;. I talked about this at SBL 2016 in San Antonio and at a Global Philology workshop in Leipzig in May, and I will be talking about it at SBL International 2017 in Berlin next month.&lt;/p&gt;
&lt;p&gt;The idea is to integrate vocabulary and morphological drills with the reading environment so the text drives what to drill, the results of the drills help determine the text, the scaffolding needed, etc.&lt;/p&gt;
&lt;p&gt;So the adaptive reading environment will model:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;what&#39;s needed to understand an upcoming passage&lt;/li&gt;
&lt;li&gt;what the student has already seen&lt;/li&gt;
&lt;li&gt;what the student has inquired about&lt;/li&gt;
&lt;li&gt;what is at an optimal recall interval&lt;/li&gt;
&lt;li&gt;what the student is good or not so good at understanding (based on explicit assessment including meta-cognitive questions)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is what I&#39;m most actively working on at the moment. As with the other categories of readers, it relies heavily on linguistic resources so I&#39;m doing a lot in that area.&lt;/p&gt;
&lt;p&gt;From an &lt;em&gt;implementation&lt;/em&gt; point-of-view, this is being implemented as a Vue.js-based application running in the browser talking to a range of microservices on the backend. Much of the &#34;heavy lifting&#34; will be done by the microservices. The generic parts of the frontend application are being broken out by Brian and me as a framework called DeepReader which could be used for all sorts of readers (even just Kindle-style EPUB readers). I&#39;ll have a lot more to say about DeepReader in the future as well as the specific application of it to building an adaptive reading environment for Greek.&lt;/p&gt;
&lt;p&gt;So there are really four distinct categories of reader projects that I&#39;ve been working on, on and off, for the last thirteen or fourteen years:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;a &#34;New Kind of Graded Reader&#34;&lt;/li&gt;
&lt;li&gt;semi-automatic generation of printed readers&lt;/li&gt;
&lt;li&gt;framework for generating static HTML files&lt;/li&gt;
&lt;li&gt;online adaptive reading environment with integrated learning tools&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;They are all related in that they build on the same linguistic data (which is where most of the effort actually goes).&lt;/p&gt;
&lt;p&gt;Hopefully all that provides a bit of a high-level guide to all the reading stuff talked about on this blog and on Twitter, and implemented in various repositories on GitHub.&lt;/p&gt;
&lt;p&gt;I should stress none of the code is specific to the New Testament or even to Greek. I&#39;d be happy to collaborate with anyone on producing the necessary linguistic data for other texts and other languages.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I sometimes get people expressing an interest in my Greek reader work or get asked about the status of my &#34;reader&#34; and I have to ask them to clarify which reader they mean. I thought I might do a quick post where I spell out various &#34;reader&#34; projects I have worked on and am working on.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 5</title>
    <link href="https://jktauber.com/2017/07/06/tour-greek-morphology-part-5/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 5"/>
    <published>2017-07-06</published>
    <updated>2017-07-06</updated>
    <id>https://jktauber.com/2017/07/06/tour-greek-morphology-part-5</id>
    <content type="html" xml:base="https://jktauber.com/2017/07/06/tour-greek-morphology-part-5/">&lt;p&gt;Part five of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;In &lt;a href=&#34;/2017/07/02/tour-greek-morphology-part-4/&#34;&gt;part four&lt;/a&gt;, we introduced the &lt;strong&gt;circumflex verbs&lt;/strong&gt; in the present active. Now we&#39;re going to look at their middle forms.&lt;/p&gt;
&lt;p&gt;Here they are alongside the middle of λύω:&lt;/p&gt;
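&lt;table&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;INF&lt;/td&gt;&lt;td&gt;λύεσθαι&lt;/td&gt;&lt;td&gt;ποιεῖσθαι&lt;/td&gt;&lt;td&gt;δηλοῦσθαι&lt;/td&gt;&lt;td&gt;τιμᾶσθαι&lt;/td&gt;&lt;td&gt;χρῆσθαι&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;1SG&lt;/td&gt;&lt;td&gt;λύομαι&lt;/td&gt;&lt;td&gt;ποιοῦμαι&lt;/td&gt;&lt;td&gt;δηλοῦμαι&lt;/td&gt;&lt;td&gt;τιμῶμαι&lt;/td&gt;&lt;td&gt;χρῶμαι&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;2SG&lt;/td&gt;&lt;td&gt;λύῃ or λύει&lt;/td&gt;&lt;td&gt;ποιῇ or ποιεῖ&lt;/td&gt;&lt;td&gt;δηλοῖ&lt;/td&gt;&lt;td&gt;τιμᾷ&lt;/td&gt;&lt;td&gt;χρῇ&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;3SG&lt;/td&gt;&lt;td&gt;λύεται&lt;/td&gt;&lt;td&gt;ποιεῖται&lt;/td&gt;&lt;td&gt;δηλοῦται&lt;/td&gt;&lt;td&gt;τιμᾶται&lt;/td&gt;&lt;td&gt;χρῆται&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;1PL&lt;/td&gt;&lt;td&gt;λυόμεθα&lt;/td&gt;&lt;td&gt;ποιούμεθα&lt;/td&gt;&lt;td&gt;δηλούμεθα&lt;/td&gt;&lt;td&gt;τιμώμεθα&lt;/td&gt;&lt;td&gt;χρώμεθα&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;2PL&lt;/td&gt;&lt;td&gt;λύεσθε&lt;/td&gt;&lt;td&gt;ποιεῖσθε&lt;/td&gt;&lt;td&gt;δηλοῦσθε&lt;/td&gt;&lt;td&gt;τιμᾶσθε&lt;/td&gt;&lt;td&gt;χρῆσθε&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;3PL&lt;/td&gt;&lt;td&gt;λύονται&lt;/td&gt;&lt;td&gt;ποιοῦνται&lt;/td&gt;&lt;td&gt;δηλοῦνται&lt;/td&gt;&lt;td&gt;τιμῶνται&lt;/td&gt;&lt;td&gt;χρῶνται&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;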
&lt;p&gt;As you can see, the circumflex pervades except in the &lt;strong&gt;1PL&lt;/strong&gt; where the law of limitation prohibits it (a circumflex cannot fall further back than the second-last syllable). This is also the one place where the accent of λύω falls on the distinguisher.&lt;/p&gt;
&lt;p&gt;Note also that, as was the case with the active, the forms in each row essentially have the same endings just with a vowel change.&lt;/p&gt;
&lt;p&gt;Here are the elements that the distinguishers in each row have in common, for both the active and middle:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;active&lt;/th&gt;
&lt;th&gt;middle&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INF&lt;/td&gt;
&lt;td&gt;-ν&lt;/td&gt;
&lt;td&gt;-σθαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-μαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;-{ι}ς&lt;/td&gt;
&lt;td&gt;-{ι}&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;-{ι}&lt;/td&gt;
&lt;td&gt;-ται&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;-μεν&lt;/td&gt;
&lt;td&gt;-μεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;-τε&lt;/td&gt;
&lt;td&gt;-σθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;-σι(ν)&lt;/td&gt;
&lt;td&gt;-νται&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The iota in the &lt;strong&gt;2SG&lt;/strong&gt; active and middle and the &lt;strong&gt;3SG&lt;/strong&gt; active is questionable because we&#39;re splitting a diphthong but we&#39;ll return to that in another post.&lt;/p&gt;
&lt;p&gt;The vowels prior to this common element seem to change as follows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;if the distinguisher has a monophthong ε in λύω,&lt;br&gt;it will have ει, ου, α, η in the other paradigms&lt;/li&gt;
&lt;li&gt;if the distinguisher has a monophthong ο in λύω,&lt;br&gt;it will have ου, ου, ω, ω in the other paradigms&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This applies to the active too (although the diphthongs there are found in more cells of the λύω paradigm).&lt;/p&gt;
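&lt;p&gt;As a rough illustration, the two correspondences above can be written as a small lookup table. This is only a sketch in Python, not code from any of my repositories: the names &lt;code&gt;CORRESPONDENCES&lt;/code&gt; and &lt;code&gt;ambiguous_pairs&lt;/code&gt; are made up for the example, only the middle forms are covered, and accents are ignored.&lt;/p&gt;

```python
# Vowel correspondences from the list above (middle forms, accents ignored).
# Outer keys: the monophthong that starts the distinguisher in the λύω column.
# Inner keys: hypothetical labels for the four circumflex paradigms.
CORRESPONDENCES = {
    "ε": {"PM-2": "ει", "PM-3": "ου", "PM-4": "α", "PM-5": "η"},
    "ο": {"PM-2": "ου", "PM-3": "ου", "PM-4": "ω", "PM-5": "ω"},
}


def ambiguous_pairs(vowel):
    """Return the pairs of paradigms that share a vowel for this λύω vowel."""
    row = CORRESPONDENCES[vowel]
    names = sorted(row)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if row[a] == row[b]
    ]


for vowel in CORRESPONDENCES:
    print(vowel, ambiguous_pairs(vowel))
```

&lt;p&gt;For ε the list of ambiguous pairs is empty, while for ο the sketch reports PM-2 with PM-3 and PM-4 with PM-5 as sharing a vowel.&lt;/p&gt;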
&lt;p&gt;We&#39;ll explore this more in the next post.&lt;/p&gt;
&lt;p&gt;Before we end this one, though, let&#39;s label the paradigms for our present middle distinguishers:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;PM-1&lt;/th&gt;
&lt;th&gt;PM-2&lt;/th&gt;
&lt;th&gt;PM-3&lt;/th&gt;
&lt;th&gt;PM-4&lt;/th&gt;
&lt;th&gt;PM-5&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INF&lt;/td&gt;
&lt;td&gt;Xεσθαι&lt;/td&gt;
&lt;td&gt;Xεῖσθαι&lt;/td&gt;
&lt;td&gt;Xοῦσθαι&lt;/td&gt;
&lt;td&gt;Xᾶσθαι&lt;/td&gt;
&lt;td&gt;Xῆσθαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;Xομαι&lt;/td&gt;
&lt;td&gt;Xοῦμαι&lt;/td&gt;
&lt;td&gt;Xοῦμαι&lt;/td&gt;
&lt;td&gt;Xῶμαι&lt;/td&gt;
&lt;td&gt;Xῶμαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;Xῃ or Xει&lt;/td&gt;
&lt;td&gt;Xῇ or Xεῖ&lt;/td&gt;
&lt;td&gt;Xοῖ&lt;/td&gt;
&lt;td&gt;Xᾷ&lt;/td&gt;
&lt;td&gt;Xῇ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;Xεται&lt;/td&gt;
&lt;td&gt;Xεῖται&lt;/td&gt;
&lt;td&gt;Xοῦται&lt;/td&gt;
&lt;td&gt;Xᾶται&lt;/td&gt;
&lt;td&gt;Xῆται&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;Xόμεθα&lt;/td&gt;
&lt;td&gt;Xούμεθα&lt;/td&gt;
&lt;td&gt;Xούμεθα&lt;/td&gt;
&lt;td&gt;Xώμεθα&lt;/td&gt;
&lt;td&gt;Xώμεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;Xεσθε&lt;/td&gt;
&lt;td&gt;Xεῖσθε&lt;/td&gt;
&lt;td&gt;Xοῦσθε&lt;/td&gt;
&lt;td&gt;Xᾶσθε&lt;/td&gt;
&lt;td&gt;Xῆσθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;Xονται&lt;/td&gt;
&lt;td&gt;Xοῦνται&lt;/td&gt;
&lt;td&gt;Xοῦνται&lt;/td&gt;
&lt;td&gt;Xῶνται&lt;/td&gt;
&lt;td&gt;Xῶνται&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Notice that the &lt;strong&gt;1SG&lt;/strong&gt;, &lt;strong&gt;1PL&lt;/strong&gt;, and &lt;strong&gt;3PL&lt;/strong&gt; distinguishers are identical for &lt;strong&gt;PM-2&lt;/strong&gt; vs &lt;strong&gt;PM-3&lt;/strong&gt; and for &lt;strong&gt;PM-4&lt;/strong&gt; vs &lt;strong&gt;PM-5&lt;/strong&gt;. This is similar to what we saw in the active case (although there, the &lt;strong&gt;1SG&lt;/strong&gt; was even less helpful in identifying the paradigm).&lt;/p&gt;
&lt;p&gt;Notice also that these are exactly the rows where the distinguisher in λύω starts with an omicron.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part five of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 4</title>
    <link href="https://jktauber.com/2017/07/02/tour-greek-morphology-part-4/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 4"/>
    <published>2017-07-02</published>
    <updated>2017-07-02</updated>
    <id>https://jktauber.com/2017/07/02/tour-greek-morphology-part-4</id>
    <content type="html" xml:base="https://jktauber.com/2017/07/02/tour-greek-morphology-part-4/">&lt;p&gt;Part four of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;In the &lt;a href=&#34;/2017/06/29/tour-greek-morphology-part-3/&#34;&gt;previous part&lt;/a&gt; we saw that more than half of the verb lexemes in the NT appearing in the present indicative follow the exact pattern of λύω, i.e. &lt;strong&gt;PA-1&lt;/strong&gt; in the active and &lt;strong&gt;PM-1&lt;/strong&gt; in the middle. In the next few parts to this series, we&#39;re going to look at some of the verbs that do NOT.&lt;/p&gt;
&lt;p&gt;Here&#39;s our first example, placed alongside λύω for comparison (a paradigm of paradigms again):&lt;/p&gt;
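&lt;table&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;INF&lt;/td&gt;&lt;td&gt;λύειν&lt;/td&gt;&lt;td&gt;ποιεῖν&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;1SG&lt;/td&gt;&lt;td&gt;λύω&lt;/td&gt;&lt;td&gt;ποιῶ&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;2SG&lt;/td&gt;&lt;td&gt;λύεις&lt;/td&gt;&lt;td&gt;ποιεῖς&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;3SG&lt;/td&gt;&lt;td&gt;λύει&lt;/td&gt;&lt;td&gt;ποιεῖ&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;1PL&lt;/td&gt;&lt;td&gt;λύομεν&lt;/td&gt;&lt;td&gt;ποιοῦμεν&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;2PL&lt;/td&gt;&lt;td&gt;λύετε&lt;/td&gt;&lt;td&gt;ποιεῖτε&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;3PL&lt;/td&gt;&lt;td&gt;λύουσι(ν)&lt;/td&gt;&lt;td&gt;ποιοῦσι(ν)&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;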
&lt;p&gt;Look closely at each pair on a row and notice a few things:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;in the infinitive, in all singulars, and in the third plural, the distinguishers are identical EXCEPT for accent&lt;/li&gt;
&lt;li&gt;in the first and second plurals, the only other difference is ου vs ο and ει vs ε&lt;/li&gt;
&lt;li&gt;whereas λύω never has the accent on the distinguisher, the seven forms of ποιῶ above ALWAYS do and it is always a circumflex&lt;/li&gt;
&lt;li&gt;the accent is not strictly recessive the way it is in λύω and &lt;strong&gt;PA-1&lt;/strong&gt; verbs in general&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We are going to call this new pattern &lt;strong&gt;PA-2&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;There are many other verbs that follow the &lt;strong&gt;PA-2&lt;/strong&gt; pattern and yet others that are quite similar but with small differences.&lt;/p&gt;
&lt;p&gt;Here are some examples placed side-by-side with λύω and ποιῶ:&lt;/p&gt;
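&lt;table&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;INF&lt;/td&gt;&lt;td&gt;λύειν&lt;/td&gt;&lt;td&gt;ποιεῖν&lt;/td&gt;&lt;td&gt;δηλοῦν&lt;/td&gt;&lt;td&gt;τιμᾶν&lt;/td&gt;&lt;td&gt;ζῆν&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;1SG&lt;/td&gt;&lt;td&gt;λύω&lt;/td&gt;&lt;td&gt;ποιῶ&lt;/td&gt;&lt;td&gt;δηλῶ&lt;/td&gt;&lt;td&gt;τιμῶ&lt;/td&gt;&lt;td&gt;ζῶ&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;2SG&lt;/td&gt;&lt;td&gt;λύεις&lt;/td&gt;&lt;td&gt;ποιεῖς&lt;/td&gt;&lt;td&gt;δηλοῖς&lt;/td&gt;&lt;td&gt;τιμᾷς&lt;/td&gt;&lt;td&gt;ζῇς&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;3SG&lt;/td&gt;&lt;td&gt;λύει&lt;/td&gt;&lt;td&gt;ποιεῖ&lt;/td&gt;&lt;td&gt;δηλοῖ&lt;/td&gt;&lt;td&gt;τιμᾷ&lt;/td&gt;&lt;td&gt;ζῇ&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;1PL&lt;/td&gt;&lt;td&gt;λύομεν&lt;/td&gt;&lt;td&gt;ποιοῦμεν&lt;/td&gt;&lt;td&gt;δηλοῦμεν&lt;/td&gt;&lt;td&gt;τιμῶμεν&lt;/td&gt;&lt;td&gt;ζῶμεν&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;2PL&lt;/td&gt;&lt;td&gt;λύετε&lt;/td&gt;&lt;td&gt;ποιεῖτε&lt;/td&gt;&lt;td&gt;δηλοῦτε&lt;/td&gt;&lt;td&gt;τιμᾶτε&lt;/td&gt;&lt;td&gt;ζῆτε&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;3PL&lt;/td&gt;&lt;td&gt;λύουσι(ν)&lt;/td&gt;&lt;td&gt;ποιοῦσι(ν)&lt;/td&gt;&lt;td&gt;δηλοῦσι(ν)&lt;/td&gt;&lt;td&gt;τιμῶσι(ν)&lt;/td&gt;&lt;td&gt;ζῶσι(ν)&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;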
&lt;p&gt;It will be clearer to see the similarities and differences by just showing the distinguishers.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;PA-1&lt;/th&gt;
&lt;th&gt;PA-2&lt;/th&gt;
&lt;th&gt;PA-3&lt;/th&gt;
&lt;th&gt;PA-4&lt;/th&gt;
&lt;th&gt;PA-5&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INF&lt;/td&gt;
&lt;td&gt;Xειν&lt;/td&gt;
&lt;td&gt;Xεῖν&lt;/td&gt;
&lt;td&gt;Xοῦν&lt;/td&gt;
&lt;td&gt;Xᾶν&lt;/td&gt;
&lt;td&gt;Xῆν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;Xω&lt;/td&gt;
&lt;td&gt;Xῶ&lt;/td&gt;
&lt;td&gt;Xῶ&lt;/td&gt;
&lt;td&gt;Xῶ&lt;/td&gt;
&lt;td&gt;Xῶ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;Xεις&lt;/td&gt;
&lt;td&gt;Xεῖς&lt;/td&gt;
&lt;td&gt;Xοῖς&lt;/td&gt;
&lt;td&gt;Xᾷς&lt;/td&gt;
&lt;td&gt;Xῇς&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;Xει&lt;/td&gt;
&lt;td&gt;Xεῖ&lt;/td&gt;
&lt;td&gt;Xοῖ&lt;/td&gt;
&lt;td&gt;Xᾷ&lt;/td&gt;
&lt;td&gt;Xῇ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;Xομεν&lt;/td&gt;
&lt;td&gt;Xοῦμεν&lt;/td&gt;
&lt;td&gt;Xοῦμεν&lt;/td&gt;
&lt;td&gt;Xῶμεν&lt;/td&gt;
&lt;td&gt;Xῶμεν&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;Xετε&lt;/td&gt;
&lt;td&gt;Xεῖτε&lt;/td&gt;
&lt;td&gt;Xοῦτε&lt;/td&gt;
&lt;td&gt;Xᾶτε&lt;/td&gt;
&lt;td&gt;Xῆτε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;Xουσι(ν)&lt;/td&gt;
&lt;td&gt;Xοῦσι(ν)&lt;/td&gt;
&lt;td&gt;Xοῦσι(ν)&lt;/td&gt;
&lt;td&gt;Xῶσι(ν)&lt;/td&gt;
&lt;td&gt;Xῶσι(ν)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;I&#39;ve given each of these patterns a label: &lt;strong&gt;PA-3&lt;/strong&gt;, &lt;strong&gt;PA-4&lt;/strong&gt;, &lt;strong&gt;PA-5&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;All four of the new patterns have circumflex accents on the distinguisher in every cell. For this reason we will call these verbs &lt;strong&gt;circumflex&lt;/strong&gt; verbs.&lt;/p&gt;
&lt;p&gt;Notice that in &lt;strong&gt;1SG&lt;/strong&gt;, the distinguisher is identical across all the circumflex verbs (-ῶ). What that means is, given just the &lt;strong&gt;1SG&lt;/strong&gt; form of a circumflex verb, you can&#39;t tell exactly which of the patterns will be followed overall. Xῶ could be &lt;strong&gt;PA-2&lt;/strong&gt;, &lt;strong&gt;PA-3&lt;/strong&gt;, &lt;strong&gt;PA-4&lt;/strong&gt; OR &lt;strong&gt;PA-5&lt;/strong&gt;. You CAN tell, however, that it&#39;s not a &lt;strong&gt;PA-1&lt;/strong&gt; verb (because of the circumflex).&lt;/p&gt;
&lt;p&gt;In contrast to &lt;strong&gt;1SG&lt;/strong&gt;, if you know ANY of the &lt;strong&gt;INF&lt;/strong&gt;, the &lt;strong&gt;2SG&lt;/strong&gt;, the &lt;strong&gt;3SG&lt;/strong&gt;, or the &lt;strong&gt;2PL&lt;/strong&gt;, you can tell exactly which pattern is being followed.&lt;/p&gt;
&lt;p&gt;That leaves the interesting case of the &lt;strong&gt;1PL&lt;/strong&gt; and &lt;strong&gt;3PL&lt;/strong&gt;. An ου in either cell distinguisher means we have a &lt;strong&gt;PA-2&lt;/strong&gt; or &lt;strong&gt;PA-3&lt;/strong&gt; but don&#39;t know which. An ω in either cell distinguisher means we have a &lt;strong&gt;PA-4&lt;/strong&gt; or &lt;strong&gt;PA-5&lt;/strong&gt; but don&#39;t know which.&lt;/p&gt;
&lt;p&gt;Put another way: presented with a &lt;strong&gt;1PL&lt;/strong&gt; ending in -οῦμεν, we can tell (at least given what we&#39;ve seen up until this point) what the &lt;strong&gt;1SG&lt;/strong&gt; and &lt;strong&gt;3PL&lt;/strong&gt; must be but we&#39;re left with two possibilities for all the other cells. The moment we know just one of those OTHER cells, though, we can tell what every cell must be.&lt;/p&gt;
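&lt;p&gt;The reasoning in the last few paragraphs can be checked mechanically. What follows is a minimal Python sketch, not code from any of my repositories: the names &lt;code&gt;RAW&lt;/code&gt;, &lt;code&gt;PARADIGMS&lt;/code&gt;, &lt;code&gt;candidates&lt;/code&gt; and &lt;code&gt;determined&lt;/code&gt; are invented for the example, the distinguishers are copied from the table above, and the movable (ν) is left off for simplicity.&lt;/p&gt;

```python
import unicodedata

CELLS = ["INF", "1SG", "2SG", "3SG", "1PL", "2PL", "3PL"]

# Distinguishers from the table above, in CELLS order, without the movable nu.
RAW = {
    "PA-1": ["ειν", "ω", "εις", "ει", "ομεν", "ετε", "ουσι"],
    "PA-2": ["εῖν", "ῶ", "εῖς", "εῖ", "οῦμεν", "εῖτε", "οῦσι"],
    "PA-3": ["οῦν", "ῶ", "οῖς", "οῖ", "οῦμεν", "οῦτε", "οῦσι"],
    "PA-4": ["ᾶν", "ῶ", "ᾷς", "ᾷ", "ῶμεν", "ᾶτε", "ῶσι"],
    "PA-5": ["ῆν", "ῶ", "ῇς", "ῇ", "ῶμεν", "ῆτε", "ῶσι"],
}


def nfc(text):
    """Normalise so precomposed and decomposed accents compare equal."""
    return unicodedata.normalize("NFC", text)


PARADIGMS = {name: [nfc(form) for form in forms] for name, forms in RAW.items()}


def candidates(cell, distinguisher):
    """Paradigms whose distinguisher in this cell matches the one given."""
    i = CELLS.index(cell)
    return [name for name, forms in PARADIGMS.items() if forms[i] == nfc(distinguisher)]


def determined(cell, distinguisher):
    """Cells that have the same distinguisher in every candidate paradigm."""
    names = candidates(cell, distinguisher)
    return [
        other
        for i, other in enumerate(CELLS)
        if len({PARADIGMS[name][i] for name in names}) == 1
    ]


print(candidates("1PL", "οῦμεν"))
print(determined("1PL", "οῦμεν"))
```

&lt;p&gt;For a &lt;strong&gt;1PL&lt;/strong&gt; in -οῦμεν the candidates are &lt;strong&gt;PA-2&lt;/strong&gt; and &lt;strong&gt;PA-3&lt;/strong&gt;, and the only cells they agree on are the &lt;strong&gt;1SG&lt;/strong&gt;, the &lt;strong&gt;1PL&lt;/strong&gt; itself, and the &lt;strong&gt;3PL&lt;/strong&gt;.&lt;/p&gt;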
&lt;p&gt;We&#39;ll continue to explore these new patterns (and their corresponding middle patterns) over the next few posts.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part four of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">Collapsible Treedown</title>
    <link href="https://jktauber.com/2017/06/30/collapsible-treedown/" rel="alternate" type="text/html" title="Collapsible Treedown"/>
    <published>2017-06-30</published>
    <updated>2017-06-30</updated>
    <id>https://jktauber.com/2017/06/30/collapsible-treedown</id>
    <content type="html" xml:base="https://jktauber.com/2017/06/30/collapsible-treedown/">&lt;p&gt;Jonathan Robie&#39;s Treedown format is a really nice way of conveying basic syntactic structure in real texts. I recently experimented a little with some code for collapsing and expanding of the structure.&lt;/p&gt;
&lt;p&gt;You can read about &lt;a href=&#34;http://jonathanrobie.biblicalhumanities.org/blog/2017/05/12/lowfat-treebanks-visualizing/&#34;&gt;Treedown in more detail&lt;/a&gt; but the idea is to convey structure in a plain text format that still conveys meaning. The name &#34;treedown&#34; is a nod to &#34;markdown&#34; and the philosophy is very similar—convey information visually but in a way that&#39;s easy to transmit and edit in plain text.&lt;/p&gt;
&lt;p&gt;One of the things that appeals to me about Treedown is how easily it can be used to just initially sketch out high level argument structure without getting into the weeds. But even if the analysis does go a little deeper, you want to be able to pull back and see the high-level structure without getting too much in the way of just reading. So to this end, I hacked together a bit of HTML, CSS and JS to demonstrate some UI to support this &#34;collapsibility&#34;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;/images/treedown_21.gif&#34; width=&#34;100%&#34;&gt;&lt;/p&gt;
&lt;p&gt;This is just plain Treedown (or one proposal for it—it&#39;s still a work in progress) but with some lightweight interactivity that lets the reader determine how much structure they want to see. Square brackets around the Treedown label indicate a further analysis that can be expanded.&lt;/p&gt;
&lt;p&gt;I made a variant that lets you get a &#34;preview&#34; of the next level of structure when you hover over it, using labelled brackets:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;/images/treedown_22.gif&#34; width=&#34;100%&#34;&gt;&lt;/p&gt;
&lt;p&gt;I then thought that perhaps this preview might be better conveyed just with colour, where each Treedown label gets its own colour. Here&#39;s what that might look like:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;/images/treedown_23.gif&#34; width=&#34;100%&#34;&gt;&lt;/p&gt;
&lt;p&gt;These are all just quick prototypes but let me know what you think.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Jonathan Robie&#39;s Treedown format is a really nice way of conveying basic syntactic structure in real texts. I recently experimented a little with some code for collapsing and expanding of the structure.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 3</title>
    <link href="https://jktauber.com/2017/06/29/tour-greek-morphology-part-3/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 3"/>
    <published>2017-06-29</published>
    <updated>2017-06-29</updated>
    <id>https://jktauber.com/2017/06/29/tour-greek-morphology-part-3</id>
    <content type="html" xml:base="https://jktauber.com/2017/06/29/tour-greek-morphology-part-3/">&lt;p&gt;Part three of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;In the first two parts (&lt;a href=&#34;/2017/06/23/tour-greek-morphology-part-1/&#34;&gt;part one&lt;/a&gt; and &lt;a href=&#34;/2017/06/25/tour-greek-morphology-part-2/&#34;&gt;part two&lt;/a&gt;), we looked at the present indicative forms of λύω.&lt;/p&gt;
&lt;p&gt;I want to now add the infinitives, λύειν (for the active) and λύεσθαι (for the middle).&lt;/p&gt;
&lt;p&gt;So we now have:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;active&lt;/th&gt;
&lt;th&gt;middle&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INF&lt;/td&gt;
&lt;td&gt;λύειν&lt;/td&gt;
&lt;td&gt;λύεσθαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;λύω&lt;/td&gt;
&lt;td&gt;λύομαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;λύεις&lt;/td&gt;
&lt;td&gt;λύῃ or λύει&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;λύει&lt;/td&gt;
&lt;td&gt;λύεται&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;λύομεν&lt;/td&gt;
&lt;td&gt;λυόμεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;λύετε&lt;/td&gt;
&lt;td&gt;λύεσθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;λύουσι(ν)&lt;/td&gt;
&lt;td&gt;λύονται&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Adding the infinitives does make certain commonalities jump out even more: all the &#39;ει&#39; in the active and both the &#39;αι&#39; and &#39;(σ)θ&#39; in the middle.&lt;/p&gt;
&lt;p&gt;But one of the big questions to address next is: does any of this have anything to do with the present indicative (and infinitive) forms of any &lt;em&gt;other&lt;/em&gt; words besides λύω?&lt;/p&gt;
&lt;p&gt;Fortunately (otherwise it might not have been the best of starting places) it &lt;strong&gt;does&lt;/strong&gt;. In the MorphGNT, there are 645 distinct lexemes appearing in the present indicative and 383 of them follow &lt;em&gt;exactly&lt;/em&gt; the same pattern as λύω above including the accentuation.&lt;/p&gt;
&lt;p&gt;In the present active indicative, there are 10 verbs that exhibit all six cells in the paradigm: θέλω, ἀκούω, λέγω, μένω, λαμβάνω, γινώσκω, πιστεύω, μέλλω, ἔχω, βλέπω (note that λύω is not, in fact, among them).&lt;/p&gt;
&lt;p&gt;In the middle, there are no words filling all six cells in the MorphGNT but there are five verbs that fill five of the cells: βούλομαι, λογίζομαι, ἔρχομαι, ἐργάζομαι, προσεύχομαι.&lt;/p&gt;
&lt;p&gt;But allowing for the missing cells, 271 lexemes follow this active pattern in the present indicative and 160 lexemes follow this middle pattern (with overlap in the case of lexemes that have both active and middle forms):&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;active&lt;/th&gt;
&lt;th&gt;middle&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;INF&lt;/td&gt;
&lt;td&gt;Xειν&lt;/td&gt;
&lt;td&gt;Xεσθαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1SG&lt;/td&gt;
&lt;td&gt;Xω&lt;/td&gt;
&lt;td&gt;Xομαι&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2SG&lt;/td&gt;
&lt;td&gt;Xεις&lt;/td&gt;
&lt;td&gt;Xῃ or Xει&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3SG&lt;/td&gt;
&lt;td&gt;Xει&lt;/td&gt;
&lt;td&gt;Xεται&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1PL&lt;/td&gt;
&lt;td&gt;Xομεν&lt;/td&gt;
&lt;td&gt;Xόμεθα&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2PL&lt;/td&gt;
&lt;td&gt;Xετε&lt;/td&gt;
&lt;td&gt;Xεσθε&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3PL&lt;/td&gt;
&lt;td&gt;Xουσι(ν)&lt;/td&gt;
&lt;td&gt;Xονται&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The accent is recessive in every case so will be an acute on the right-most syllable of X in every case but Xόμεθα where the law of limitation means the accent can&#39;t go back as far as X. I could skip accents altogether but they&#39;ll turn out to be very important in the next few posts so it&#39;s actually helpful to include them in this template where they fall on the distinguisher (the part other than the X that varies from cell to cell). And note that if the distinguisher doesn&#39;t have an accent in the template it&#39;s because it doesn&#39;t have the accent in the full form.&lt;/p&gt;
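&lt;p&gt;To make the accent note concrete, here is a minimal Python sketch of filling in the template. It is not code from any of my repositories: the function names &lt;code&gt;strip_accents&lt;/code&gt;, &lt;code&gt;has_accent&lt;/code&gt; and &lt;code&gt;fill&lt;/code&gt; are made up for the example, and it assumes X is supplied with its acute already on the right-most syllable (as in λύ) and that the pattern is the one in the table above.&lt;/p&gt;

```python
import unicodedata

# Combining grave, acute, and Greek circumflex (perispomeni).
ACCENTS = {"\u0300", "\u0301", "\u0342"}


def strip_accents(text):
    """Remove accents but keep breathings, iota subscript, and diaeresis."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(c for c in decomposed if c not in ACCENTS)
    return unicodedata.normalize("NFC", kept)


def has_accent(text):
    """True if any character in the text carries an accent."""
    return any(c in ACCENTS for c in unicodedata.normalize("NFD", text))


def fill(x, distinguisher):
    """Join X and a distinguisher, leaving exactly one accent on the form.

    If the distinguisher carries the accent (as in the 1PL middle), the
    accent on X is dropped; otherwise X keeps its own accent.
    """
    if has_accent(distinguisher):
        x = strip_accents(x)
    return unicodedata.normalize("NFC", x + distinguisher)


for distinguisher in ["ειν", "ομαι", "όμεθα"]:
    print(fill("λύ", distinguisher))
```

&lt;p&gt;With X given as λύ, the distinguishers ειν and ομαι leave the acute on X (λύειν, λύομαι), while όμεθα takes it over (λυόμεθα).&lt;/p&gt;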
&lt;p&gt;I&#39;m going to call the active and middle pattern above &lt;strong&gt;PA-1&lt;/strong&gt; and &lt;strong&gt;PM-1&lt;/strong&gt; respectively.&lt;/p&gt;
&lt;p&gt;We must avoid the temptation to talk of stems at this point. Even though X above does correspond to what&#39;s normally thought of as the stem, we will encounter many paradigm templates (including in the next few posts in this series) where that is not the case and it&#39;s better to be precise and avoid confusion from the start.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part three of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 2</title>
    <link href="https://jktauber.com/2017/06/25/tour-greek-morphology-part-2/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 2"/>
    <published>2017-06-25</published>
    <updated>2017-06-25</updated>
    <id>https://jktauber.com/2017/06/25/tour-greek-morphology-part-2</id>
    <content type="html" xml:base="https://jktauber.com/2017/06/25/tour-greek-morphology-part-2/">&lt;p&gt;Part two of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).&lt;/p&gt;
&lt;p&gt;In the &lt;a href=&#34;/2017/06/23/tour-greek-morphology-part-1/&#34;&gt;first part&lt;/a&gt; we took an initial look at the present active indicative paradigm for λύω, repeated below for easy reference:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;λύω&lt;/li&gt;
&lt;li&gt;λύεις&lt;/li&gt;
&lt;li&gt;λύει&lt;/li&gt;
&lt;li&gt;λύομεν&lt;/li&gt;
&lt;li&gt;λύετε&lt;/li&gt;
&lt;li&gt;λύουσι(ν)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There are a number of morphosyntactic properties we could alter to see the effect on the paradigm, but in this post, we&#39;ll look at the middle voice:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;λύομαι&lt;/li&gt;
&lt;li&gt;λύῃ or λύει&lt;/li&gt;
&lt;li&gt;λύεται&lt;/li&gt;
&lt;li&gt;λυόμεθα&lt;/li&gt;
&lt;li&gt;λύεσθε&lt;/li&gt;
&lt;li&gt;λύονται&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So again, we&#39;re showing, side-by-side, the various number-person forms for λύω, keeping the tense, aspect, voice, and mood constant. In this way we can see, by comparing the paradigms (a paradigm of paradigms!), how the active/middle alternation is realized in Greek (at least for the present indicative λύω!)&lt;/p&gt;
&lt;p&gt;A few things may immediately jump out at you:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the forms continue to all start with λυ&lt;/li&gt;
&lt;li&gt;the υ is always followed by a vowel (and mostly ε or ο)&lt;/li&gt;
&lt;li&gt;the second person singular has two possible forms&lt;/li&gt;
&lt;li&gt;three of the forms end in -αι&lt;/li&gt;
&lt;li&gt;both the first person forms have a μ and both the third person forms have a τ&lt;/li&gt;
&lt;li&gt;the first and second plural both have a θ and there seems to be more of a link between the active and middle forms (&lt;strong&gt;ομε&lt;/strong&gt;ν/&lt;strong&gt;ομε&lt;/strong&gt;θα, &lt;strong&gt;ε&lt;/strong&gt;τ&lt;strong&gt;ε&lt;/strong&gt;/&lt;strong&gt;ε&lt;/strong&gt;σθ&lt;strong&gt;ε&lt;/strong&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We have to be careful not to make too much of some of these yet. Many a bad linguistic analysis has come from noticing patterns in a small number of instances without seeing if the same pattern applies more broadly! We need more data. But these initial observations are at least things to keep in the backs of our minds as we explore more forms. Some of them will prove particularly interesting later on.&lt;/p&gt;
&lt;p&gt;For now I just want to explore the two second person singular forms, λύῃ and λύει. You&#39;ll notice one of these forms is identical to the third singular active form. Isn&#39;t this potentially confusing?&lt;/p&gt;
&lt;p&gt;Yes, but there are two things to note here: one, it should generally be clear from the context, regardless of the ending, whether a third person active or second person middle is intended. Ambiguities in morphology like this are far more likely in cases where &lt;em&gt;multiple&lt;/em&gt; morphosyntactic properties vary at once (in this case both person AND voice) and where the larger context is likely to make clear which alternative is meant. It&#39;s worth also noting, for example, that -ει can also end a dative noun (and in fact does in over 300 cases in the NT).&lt;/p&gt;
&lt;p&gt;Two, the -ῃ forms are much more common in the NT than the -ει and, in fact, there&#39;s actually only one second person -ει form in the SBLGNT text and it is βούλει where &lt;em&gt;lexically&lt;/em&gt; the word must be middle anyway and so even the context isn&#39;t needed to disambiguate.&lt;/p&gt;
&lt;p&gt;As to why two forms developed in the first place, we&#39;ll have to wait a bit to discuss that.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Part two of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).</summary>
  </entry><entry>
    <title type="html">Another European Trip</title>
    <link href="https://jktauber.com/2017/06/25/another-european-trip/" rel="alternate" type="text/html" title="Another European Trip"/>
    <published>2017-06-25</published>
    <updated>2017-06-25</updated>
    <id>https://jktauber.com/2017/06/25/another-european-trip</id>
    <content type="html" xml:base="https://jktauber.com/2017/06/25/another-european-trip/">&lt;p&gt;I was here last month but I&#39;m back again for a series of conferences and then my graduation.&lt;/p&gt;
&lt;p&gt;Last week I attended the inaugural &lt;strong&gt;Language, Data and Knowledge&lt;/strong&gt; conference in Galway, Ireland including the OntoLex Model Workshop which preceded it. The LDK conference was a nice intersection of linguistics and linked data very much in the spirit of the work described on this website. I got to meet a few people I&#39;ve known of for a while as well as meet some new people I hope to stay in touch with and potentially collaborate with. The conference will be biennial, with the next one in Leipzig. I definitely plan to submit something for that one!&lt;/p&gt;
&lt;p&gt;Then I attended &lt;strong&gt;VueConf&lt;/strong&gt; in Wrocław, Poland. Vue.js is the JavaScript framework I&#39;m using for my online reading environment work and the timing turned out perfectly for me to attend. I gave a lightning talk on the DeepReader project (which I&#39;ll also blog about here soon).&lt;/p&gt;
&lt;p&gt;I&#39;m currently in Leipzig just to visit some people at the Humboldt Chair of Digital Humanities again.&lt;/p&gt;
&lt;p&gt;Then I&#39;m heading to Cambridge for the &lt;strong&gt;Tyndale House Workshop in Greek Prepositions&lt;/strong&gt;. Looking forward to seeing a lot of my friends there and having some good discussions, not just about the topic at hand but more broadly as well.&lt;/p&gt;
&lt;p&gt;Then I&#39;m heading to Lampeter, Wales for my graduation on July 7th. Three years ago, I decided that it might be useful for me to have a qualification in Classical Greek as well as in linguistics and so I started pursuing a postgraduate diploma at the University of Wales Trinity Saint David. Two days ago, I found out I&#39;m being awarded the diploma &lt;em&gt;with Distinction&lt;/em&gt; which was my unspoken hope despite occasionally doing poorly at my unseen translations.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I was here last month but I&#39;m back again for a series of conferences and then my graduation.</summary>
  </entry><entry>
    <title type="html">A Tour of Greek Morphology: Part 1</title>
    <link href="https://jktauber.com/2017/06/23/tour-greek-morphology-part-1/" rel="alternate" type="text/html" title="A Tour of Greek Morphology: Part 1"/>
    <published>2017-06-23</published>
    <updated>2017-06-23</updated>
    <id>https://jktauber.com/2017/06/23/tour-greek-morphology-part-1</id>
    <content type="html" xml:base="https://jktauber.com/2017/06/23/tour-greek-morphology-part-1/">&lt;p&gt;This is the first post in a (likely long) series exploring the inflectional morphology of Greek. My goal is to work through various aspects of Greek morphology to help students think more systematically about the subject.&lt;/p&gt;
&lt;p&gt;I ultimately hope to cover everything that a beginner-intermediate grammar might but in a much more exploratory fashion. I&#39;ll occasionally touch on morphological theory but I mostly want to point out phenomena in the language that students have already seen but perhaps have not thought about in any depth.&lt;/p&gt;
&lt;p&gt;We&#39;ll start with a paradigm familiar to all students of New Testament Greek:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;λύω&lt;/li&gt;
&lt;li&gt;λύεις&lt;/li&gt;
&lt;li&gt;λύει&lt;/li&gt;
&lt;li&gt;λύομεν&lt;/li&gt;
&lt;li&gt;λύετε&lt;/li&gt;
&lt;li&gt;λύουσι(ν)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;At its most basic, a &lt;strong&gt;paradigm&lt;/strong&gt; is just a showing of related forms next to one another for comparison. The idea is to get a sense of how forms and meaning relate by showing contrastive examples.&lt;/p&gt;
&lt;p&gt;In most cases, there&#39;s something held constant across all the cells. In the list above, all the forms are present active indicative forms of the word λύω. What distinguishes them from the point of view of their &lt;strong&gt;morphosyntactic properties&lt;/strong&gt; is the person and number.&lt;/p&gt;
&lt;p&gt;Respectively the list above is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the first person singular (present active indicative form of the word λύω)&lt;/li&gt;
&lt;li&gt;the second person singular&lt;/li&gt;
&lt;li&gt;the third person singular&lt;/li&gt;
&lt;li&gt;the first person plural&lt;/li&gt;
&lt;li&gt;the second person plural&lt;/li&gt;
&lt;li&gt;the third person plural&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It may not be the case that the &lt;em&gt;forms&lt;/em&gt; all have something in common, although in this case you can see they all start with λύ. It may be tempting to make the simple analysis that λύ itself means &#34;the present active indicative form of the word λύω&#34; and, say, εις means &#34;the second person singular&#34;. But as we shall see, that&#39;s not the most helpful analysis in general.&lt;/p&gt;
&lt;p&gt;It&#39;s worth thinking about other possibilities we could draw from just this tiny example (even though many theories will be ruled out once we look at other data): perhaps λ indicates indicative; perhaps εις indicates not only second person singular but present active too; perhaps εις is only used if the word starts with a λ.&lt;/p&gt;
&lt;p&gt;About all we can say at this stage is the way you discriminate between, say, a second person singular and a third person singular, in the case of the present active indicative of λύω, is the εις vs ει. And that particular example, in the absence of seeing the other cells, may even lead one to conclude you get from the third singular to the second singular by adding a sigma.&lt;/p&gt;
&lt;p&gt;The point is there&#39;s a LOT we can&#39;t tell yet. What we CAN tell, within the set of forms with the properties held constant, is how to discriminate across forms with the morphosyntactic properties that vary. In other words, IF we have a present active indicative of λύω, how do we tell the person and number?&lt;/p&gt;
&lt;p&gt;There is one very important property of Greek morphology that we can see just in the paradigm so far: there is no &lt;em&gt;consistent&lt;/em&gt; way person is discriminated for a given number, nor number for a given person. In other words, the relationship between the forms λύω and λύομεν seems completely unrelated to that between λύεις and λύετε. And the relationship between λύω and λύεις seems completely unrelated to that between λύομεν and λύετε even though they differ in meaning in only one property. Or put another way, we can&#39;t just tell the person OR number, only the person AND number. We will talk more about this in future posts.&lt;/p&gt;
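&lt;p&gt;Here&#39;s a minimal Python sketch of that point. The names &lt;code&gt;PARADIGM&lt;/code&gt;, &lt;code&gt;STEM&lt;/code&gt; and &lt;code&gt;endings&lt;/code&gt; are just for illustration, and I&#39;ve left the moveable nu off the third person plural.&lt;/p&gt;

```python
# Present active indicative of λύω, keyed by (person, number).
PARADIGM = {
    (1, "sg"): "λύω",
    (2, "sg"): "λύεις",
    (3, "sg"): "λύει",
    (1, "pl"): "λύομεν",
    (2, "pl"): "λύετε",
    (3, "pl"): "λύουσι",
}

STEM = "λύ"  # the part the six forms happen to share

# Whatever follows the shared part, cell by cell.
endings = {cell: form[len(STEM):] for cell, form in PARADIGM.items()}

for cell, ending in endings.items():
    print(cell, ending)

# All six endings are distinct and none of them splits into a reusable
# "person piece" plus "number piece": each ending signals person AND number.
assert len(set(endings.values())) == 6
```

&lt;p&gt;Nothing here is an analysis yet. It just makes visible that the only thing we can read off each ending is the whole cell.&lt;/p&gt;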
&lt;p&gt;Finally, you may be wondering &#34;why is λύω used so often?&#34;. There are multiple reasons for this choice. Firstly, as we shall see later, λύω has completely regular stem formation. Secondly, the υ is robust in the face of what sounds follow it. Some Classical Greek textbooks will use παύω for the same reasons.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">This is the first post in a (likely long) series exploring the inflectional morphology of Greek. My goal is to work through various aspects of Greek morphology to help students think more systematically about the subject.</summary>
  </entry><entry>
    <title type="html">Modelling Derivational Morphology</title>
    <link href="https://jktauber.com/2017/05/31/modelling-derivational-morphology/" rel="alternate" type="text/html" title="Modelling Derivational Morphology"/>
    <published>2017-05-31</published>
    <updated>2017-05-31</updated>
    <id>https://jktauber.com/2017/05/31/modelling-derivational-morphology</id>
    <content type="html" xml:base="https://jktauber.com/2017/05/31/modelling-derivational-morphology/">&lt;p&gt;While most of my focus has been on inflectional morphology, I&#39;ve done a little bit of work on modelling derivational morphology and it&#39;s been a desideratum for my reader and learning algorithm work dating back to at least the original 2008 &#34;New Kind of Graded Reader&#34; presentations.&lt;/p&gt;
&lt;p&gt;In the 90s I was even in conversation with Harold Greenlee about putting his work online. There are numerous problems with this kind of work, though. The first is just mistakes and dubious connections. John Lee&#39;s 2013 paper &lt;em&gt;Etymological Follies: Three Recent Lexicons of the New Testament&lt;/em&gt; gives numerous examples. Lee is always worth listening to when it comes to lexicons!&lt;/p&gt;
&lt;p&gt;There&#39;s another major issue, which is that expressing etymology (or even just cognate groupings) doesn&#39;t really tell you what I actually care about: how easy the meaning of a lexical item is to learn based on other cognate lexical items you&#39;ve learned. I&#39;ve previously talked about &lt;a href=&#34;/2015/11/13/initial-thoughts-cost-learning-form/&#34;&gt;modelling the cost of learning a new form&lt;/a&gt; in the context of inflectional morphology but I&#39;m also interested (as mentioned in various &#34;New Kind of Graded Reader&#34; presentations) in the derivational equivalent between lexemes. There&#39;s some interesting theoretical work in this area going back to at least Jackendoff&#39;s 1975 paper &lt;em&gt;Morphological and Semantic Regularities in the Lexicon&lt;/em&gt;. This was picked up in Bochner&#39;s 1993 book &lt;em&gt;Simplicity in Generative Morphology&lt;/em&gt; which was a huge influence on me in thinking about morphology as paradigmatic relationships &lt;em&gt;between&lt;/em&gt; words rather than morpheme-based approaches.&lt;/p&gt;
&lt;p&gt;So for my purposes, at least, I want to model how easy it is to work out the meaning of a word from known cognates potentially given similar analogical pairs of cognates. What I&#39;d ultimately like to develop is some sort of weighting between pairs that represents how transparent the connection in meaning is from their cognate forms.&lt;/p&gt;
&lt;p&gt;Take for example the pair&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Ἰταλία:Ἰταλικός&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If that pair is known, then something like&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Γαλατία:Γαλατικός&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;is much easier to understand. So if you understand Ἰταλία, Ἰταλικός, and Γαλατία, you can almost certainly take a stab at guessing the meaning of Γαλατικός. I care about that because a big part of my research is modelling how &#34;easy&#34; a passage might be for a student to read.&lt;/p&gt;
&lt;p&gt;The analogy might be abstracted as&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;-ια:-ικος::place:person-from-that-place&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;but it also applies to things like&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Πόντος:Ποντικός&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;which is -ος:-ικος so first/second declension doesn&#39;t matter.&lt;/p&gt;
&lt;p&gt;Given a new place, you could probably easily construct a plausible denominal adjective for someone from that place with -ικος. A Greek speaker unfamiliar with the philosophical school would still immediately recognize Στοϊκός as suggesting &#34;someone from the στοά&#34; although we might want to score the transparency of that lower than those based on geographical proper nouns.&lt;/p&gt;
&lt;p&gt;But now consider&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;κοινωνία:κοινωνικός&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;or&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;εἰρήνη:εἰρηνικός&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The meaning of the &lt;em&gt;root&lt;/em&gt; clearly transfers to the lexical items in each pair but the relationship between the items in each pair is a little less transparent. It&#39;s still there if you think about it but it almost certainly needs to be weighted less. κοινωνία and εἰρήνη are not physical places. The -ικος derivative is still in some sense about something coming from somewhere but rather than a person from a place, it seems to be a state coming from another state (metaphorical place).&lt;/p&gt;
&lt;p&gt;Then you get something like&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;ὄνος:ὀνικός&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If you think really really hard about it you can see how ὀνικός (in the sense of millstone) might have come from ὄνος (donkey). But this is at best a potentially useful mnemonic for learners rather than a productive derivation. It should be weighted even lower (no pun intended). And then where might&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;κέραμος:κεραμικός&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;fit in this weighting? (and to what extent do English cognates help too in cases such as this?)&lt;/p&gt;
&lt;p&gt;I&#39;m not yet sure how best to produce weightings for this kind of lexical relatedness. My guess is a first pass could be achieved by crowdsourcing on &lt;a href=&#34;http://oxlos.org&#34;&gt;oxlos&lt;/a&gt;. Ultimately, some of the weighting could be calculated via regression based on vocabulary quizzes (although I worry about confounding factors unless the students are beginners). Even just doing the crowdsourcing would be interesting to see how much agreement there was in the &#34;obvious relatedness&#34; ordering of pairs like Πόντος:Ποντικός &amp;gt; στοά:Στοϊκός &amp;gt; κοινωνία:κοινωνικός &amp;gt; ὄνος:ὀνικός.&lt;/p&gt;
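&lt;p&gt;As a rough sketch of how agreement in that ordering might be measured, here&#39;s a small Python example that counts how many pairs of items two raters put in the same relative order. The function name &lt;code&gt;pairwise_agreement&lt;/code&gt; and the two raters&#39; orderings are hypothetical, made up purely for illustration, and this is only one of several possible agreement measures.&lt;/p&gt;

```python
from itertools import combinations


def pairwise_agreement(order_a, order_b):
    """Fraction of item pairs that two raters put in the same relative order."""
    pos_b = {item: i for i, item in enumerate(order_b)}
    # Each (x, y) has x ranked above y by rater A.
    pairs = list(combinations(order_a, 2))
    same = sum(1 for x, y in pairs if sorted((x, y), key=pos_b.get) == [x, y])
    return same / len(pairs)


# Hypothetical responses, most to least transparent (not real data).
rater_1 = ["Πόντος:Ποντικός", "στοά:Στοϊκός", "κοινωνία:κοινωνικός", "ὄνος:ὀνικός"]
rater_2 = ["Πόντος:Ποντικός", "κοινωνία:κοινωνικός", "στοά:Στοϊκός", "ὄνος:ὀνικός"]

print(pairwise_agreement(rater_1, rater_2))
```

&lt;p&gt;A score of 1.0 means the two orderings are identical; swapping two adjacent items, as in the made-up example, costs exactly one of the six pairs.&lt;/p&gt;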
&lt;p&gt;Finally, it occurs to me this gives a potential measure of &#34;false friendship&#34; amongst cognates as a mismatch between the obviousness of relatedness in form vs in meaning.&lt;/p&gt;
&lt;p&gt;I have some old work at &lt;a href=&#34;https://github.com/morphgnt/morphological-lexicon/tree/master/projects/derivational_morphology&#34;&gt;https://github.com/morphgnt/morphological-lexicon/tree/master/projects/derivational_morphology&lt;/a&gt; which I probably need to clean up at some point for all this.&lt;/p&gt;
&lt;p&gt;As is often the case, this blog post was triggered by Jonathan Robie asking me something and me realising I&#39;d never written up my thoughts on the topic despite having thought about it on and off for a decade :-)&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">While most of my focus has been on inflectional morphology, I&#39;ve done a little bit of work on modelling derivational morphology and it&#39;s been a desideratum for my reader and learning algorithm work dating back to at least the original 2008 &#34;New Kind of Graded Reader&#34; presentations.</summary>
  </entry><entry>
    <title type="html">Comparing Analyses from Herodotus</title>
    <link href="https://jktauber.com/2017/05/24/comparing-analyses-herodotus/" rel="alternate" type="text/html" title="Comparing Analyses from Herodotus"/>
    <published>2017-05-24</published>
    <updated>2017-05-24</updated>
    <id>https://jktauber.com/2017/05/24/comparing-analyses-herodotus</id>
    <content type="html" xml:base="https://jktauber.com/2017/05/24/comparing-analyses-herodotus/">&lt;p&gt;An analysis I did of a couple of chapters of Herodotus looks like it might be an interesting example to use for various treebanking approaches—both in terms of how things are structured as well as how they are visualised.&lt;/p&gt;
&lt;p&gt;As the last assignment for my Postgraduate Diploma in Ancient Greek, I had to write a brief commentary of Herodotus 2.35–36, which catalogs (with hasty generalisations galore) differences between Egypt and the rest of the world. The catalog consists of a series of statements of the form “Egyptians do THIS whereas everyone else does THAT” or “[In Egypt] the men do THIS and the women do THAT [as opposed to the other way around like everywhere else]”.&lt;/p&gt;
&lt;p&gt;In his commentary, Lloyd notes that this sort of catalog could be quite monotonous but that Herodotus avoids this through “skilful stylistic variation”. My commentary spent a decent proportion of its short word count digging deeper into this variation.&lt;/p&gt;
&lt;p&gt;Quite coincidentally, Greg Crane sent me some examples of student treebanking recently in the context of how to compare analyses and they happened to be of Herodotus 2.35. They differ from each other and from my own way of thinking about the sentences. Note that these aren’t difficult or ambiguous sentences, though! The syntax is easy, I just don’t think most analysis conventions and visualisation tools do a great job of capturing what’s going on.&lt;/p&gt;
&lt;p&gt;In my assignment, I started off presenting a canonical example of the construction and it’s that example that I want to show here. The original sentence is&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;τὰ ἄχθεα οἱ μὲν ἄνδρες ἐπὶ τῶν κεφαλέων φορέουσι, αἱ δὲ γυναῖκες ἐπὶ τῶν ὤμων.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;But I started off considering these sentences:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;οἱ ἄνδρες&lt;/strong&gt; τὰ ἄχθεα ἐπὶ τῶν &lt;strong&gt;κεφαλέων&lt;/strong&gt; φορέουσι&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;αἱ γυναῖκες&lt;/strong&gt; τὰ ἄχθεα ἐπὶ τῶν &lt;strong&gt;ὤμων&lt;/strong&gt; φορέουσι&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The verb (in the present, as always in these comparisons), direct object, and prepositional phrase construction are identical. What is being contrasted (shown in bold) is how the particular location (the complement in the prepositional phrase) varies with the subject.&lt;/p&gt;
&lt;p&gt;Herodotus sets up the contrast with μέν and δέ postpositives.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;οἱ &lt;strong&gt;μὲν&lt;/strong&gt; ἄνδρες τὰ ἄχθεα ἐπὶ τῶν κεφαλέων φορέουσι&lt;/p&gt;
&lt;p&gt;αἱ &lt;strong&gt;δὲ&lt;/strong&gt; γυναῖκες τὰ ἄχθεα ἐπὶ τῶν ὤμων φορέουσι&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;He then alters the “constants” in the comparison, topicalising the direct object and eliding repetition of the verb. This results in:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;τὰ ἄχθεα&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;μὲν&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;b&gt;οἱ … ἄνδρες&lt;/b&gt;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;ἐπὶ τῶν &lt;b&gt;κεφαλέων&lt;/b&gt; φορέουσι&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;δὲ&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;b&gt;αἱ … γυναῖκες&lt;/b&gt;&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;ἐπὶ τῶν &lt;b&gt;ὤμων&lt;/b&gt; [φορέουσι]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The above was an indented structure I manually constructed for my commentary. It’s not machine actionable and is missing a lot but I think it does a decent job of capturing some of what&#39;s going on. It makes clear:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the topicalisation of τὰ ἄχθεα&lt;/li&gt;
&lt;li&gt;the μέν and δέ construction as a whole&lt;/li&gt;
&lt;li&gt;the elision of the verb&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It is these three properties that I think make this a particularly interesting example.&lt;/p&gt;
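&lt;p&gt;To give a feel for what a machine-actionable version of that indented structure could look like, here&#39;s a minimal Python sketch. The dictionary keys (&lt;code&gt;topic&lt;/code&gt;, &lt;code&gt;contrast&lt;/code&gt;, &lt;code&gt;ellipsis_of&lt;/code&gt;) and the helper functions are hypothetical and don&#39;t correspond to any existing treebank format; the point is only that the topicalised object, the paired particles, and the omitted verb can each be represented explicitly.&lt;/p&gt;

```python
# Hypothetical nested representation of the sentence; keys are illustrative only.
sentence = {
    "topic": {"role": "object", "text": "τὰ ἄχθεα"},
    "contrast": [
        {
            "particle": "μέν",
            "subject": "οἱ ἄνδρες",
            "location": "ἐπὶ τῶν κεφαλέων",
            "verb": {"id": "v1", "text": "φορέουσι"},
        },
        {
            "particle": "δέ",
            "subject": "αἱ γυναῖκες",
            "location": "ἐπὶ τῶν ὤμων",
            "verb": {"ellipsis_of": "v1"},  # omitted verb points back at v1
        },
    ],
}


def resolve_verb(limb, limbs):
    """Return the verb text for a limb, following an ellipsis reference if needed."""
    verb = limb["verb"]
    if "text" in verb:
        return verb["text"]
    for other in limbs:
        if other["verb"].get("id") == verb["ellipsis_of"]:
            return other["verb"]["text"]
    raise KeyError(verb["ellipsis_of"])


def expand(sentence):
    """Spell out each limb as a full clause with the shared topic and verb restored."""
    limbs = sentence["contrast"]
    topic = sentence["topic"]["text"]
    return [
        " ".join([limb["subject"], topic, limb["location"], resolve_verb(limb, limbs)])
        for limb in limbs
    ]


for clause in expand(sentence):
    print(clause)
```

&lt;p&gt;Because the omitted verb is a reference rather than a copy, a query can still tell the expressed verb from the supplied one, which is exactly the co-referencing that is hard to show in a plain dependency tree.&lt;/p&gt;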
&lt;p&gt;Here’s the first student treebank analysis:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;/images/herodotus1.png&#34; width=&#34;100%&#34;&gt;&lt;/p&gt;
&lt;p&gt;The student supplies the elided verb (although it’s not co-referenced in any way) but not the elided direct object. There’s no indication of the topicalisation.&lt;/p&gt;
&lt;p&gt;It doesn’t quite seem right to me to say the two clauses are conjoined by δέ with the μέν hanging off the verb. I think of the μέν and δέ as equal partners in this construction and as tagging the two things being compared.&lt;/p&gt;
&lt;p&gt;Here’s the second student treebank analysis:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;/images/herodotus2.png&#34; width=&#34;100%&#34;&gt;&lt;/p&gt;
&lt;p&gt;This analysis seems a lot more confused. The coordination is shown as being done with the μέν this time, with the δέ dangling. The prepositional phrases are shown as governed by the subjects rather than the verb.&lt;/p&gt;
&lt;p&gt;To be clear, I’m not trying to critique the students so much as raise questions for analysis conventions and visualisation, especially for reading environments and querying.&lt;/p&gt;
&lt;p&gt;Again, this sentence (like the others in Herodotus 2.35–36) isn’t difficult. I doubt either student had any trouble understanding the sentence. I just think it wasn’t clear how to adequately model their understanding of the structure.&lt;/p&gt;
&lt;p&gt;I think elision and conjunction are the biggest issues in most analyses like this and good structures and visualisation for handling those will go a long way to making treebanks more consistent and more useful.&lt;/p&gt;
&lt;p&gt;Using this sentence from Herodotus as an example, what are better ways of making sure analyses both enable useful queries and can be visualised in more perspicuous ways?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE&lt;/strong&gt;: perhaps &#34;coordination&#34; would be better than conjunction as one of the &#34;biggest issues&#34; and I think &#34;theticals&#34; (HT: Jonathan Robie) could be added to that list to make the triad: elision, coordination, and theticals.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE 2&lt;/strong&gt;: I also need to stop saying elision when I mean ellipsis! I&#39;m spending too much time with morphophonology and not enough time with syntax :-)&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">An analysis I did of a couple of chapters of Herodotus looks like it might be an interesting example to use for various treebanking approaches—both in terms of how things are structured as well as how they are visualised.</summary>
  </entry><entry>
    <title type="html">Headed to Germany Next Week</title>
    <link href="https://jktauber.com/2017/05/03/headed-germany-next-week/" rel="alternate" type="text/html" title="Headed to Germany Next Week"/>
    <published>2017-05-03</published>
    <updated>2017-05-03</updated>
    <id>https://jktauber.com/2017/05/03/headed-germany-next-week</id>
    <content type="html" xml:base="https://jktauber.com/2017/05/03/headed-germany-next-week/">&lt;p&gt;Next week I&#39;m headed to Germany for a whirlwind trip to Göttingen, Heidelberg, and Leipzig to share and discuss ideas with other scholars.&lt;/p&gt;
&lt;p&gt;I&#39;ll be speaking at a Global Philology workshop in Göttingen, attending a Digital Classics conference in Heidelberg (where I&#39;ll also have to sit the final exam for my Postgraduate Diploma in Greek if I can find someone to invigilate), and then spending a few days in Leipzig meeting with the team at the Humboldt Chair of Digital Humanities at Universität Leipzig.&lt;/p&gt;
&lt;p&gt;I&#39;m very excited to now be working more closely with the digital classics community and meeting many people whose names I&#39;ve known for a while.&lt;/p&gt;
&lt;p&gt;I&#39;m also thrilled to visit Leipzig again after more than ten years and get my fill of musical history there. I&#39;m also hoping for a bit of a physics history fill too given the importance of both Göttingen and Leipzig in the history of quantum mechanics.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Next week I&#39;m headed to Germany for a whirlwind trip to Göttingen, Heidelberg, and Leipzig to share and discuss ideas with other scholars.</summary>
  </entry><entry>
    <title type="html">Handling Morphological Ambiguity</title>
    <link href="https://jktauber.com/2017/04/21/handling-morphological-ambiguity/" rel="alternate" type="text/html" title="Handling Morphological Ambiguity"/>
    <published>2017-04-21</published>
    <updated>2017-04-21</updated>
    <id>https://jktauber.com/2017/04/21/handling-morphological-ambiguity</id>
    <content type="html" xml:base="https://jktauber.com/2017/04/21/handling-morphological-ambiguity/">&lt;p&gt;On my &lt;a href=&#34;/now/&#34;&gt;now&lt;/a&gt; page, I currently list &#34;finalising an improved set of morphology tags to use&#34; under Medium Term. As I find myself sometimes having to clarify the motivation for and state of this, I thought I&#39;d share what I just wrote in the &lt;a href=&#34;http://biblicalhumanities.org&#34;&gt;Biblical Humanities&lt;/a&gt; Slack.&lt;/p&gt;
&lt;p&gt;Firstly, some background on previous notes...&lt;/p&gt;
&lt;p&gt;Back in 2014, I wrote down some notes, &lt;a href=&#34;https://github.com/morphgnt/sblgnt/wiki/Proposal-for-a-New-Tagging-Scheme&#34;&gt;Proposal for a New Tagging Scheme&lt;/a&gt;, after discussions with Mike Aubrey. In 2015, after some discussions with Emma Ehrhardt, I wrote down &lt;a href=&#34;https://github.com/morphgnt/sblgnt/wiki/Handling-Ambiguity&#34;&gt;Handling Ambiguity&lt;/a&gt;. Then in February 2017, after discussion on the Biblical Humanities Slack, I put forward a concrete &lt;a href=&#34;https://github.com/morphgnt/sblgnt/wiki/Proposal-for-Gender-Tagging&#34;&gt;Proposal for Gender Tagging&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here&#39;s a slightly cleaned up version of what I wrote in Slack...&lt;/p&gt;
&lt;p&gt;All I&#39;ve done is propose a way of representing certain single-feature ambiguities (especially gender but also nom/acc in neuter). I have not proposed anything for multi-feature ambiguities nor have I actually DONE any work that uses these proposals.&lt;/p&gt;
&lt;p&gt;Multi-feature ambiguities at the morphology level (1S vs 3P, GS vs AP, etc) are rarely ambiguous at the syntactic or semantic level for very good reason: the syntactic/semantic-level disambiguation is what allows one to tolerate the ambiguity at the morphology level (one reason that, as a cognitive scientist, I quite like discriminative models of morphology).&lt;/p&gt;
&lt;p&gt;But if I continue with my goal to produce a purely morphological analysis, without &#34;downward&#34; disambiguation, then I want to be able to provide a way of representing form over function AND representing ambiguity.&lt;/p&gt;
&lt;p&gt;I want to stress again that I think nom vs acc in neuter, or gender in genitive plurals is a DIFFERENT kind of ambiguity than 1S vs 3P or GS vs AP. For these multi-feature ambiguities (or what my wiki page calls extended syncretism although not sure I really like that term) it may come down to just providing a disjunction of codes, e.g. GSF∨APF.&lt;/p&gt;
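&lt;p&gt;As a minimal sketch of that distinction, a disjunction of codes can be treated as a set and classified by how many feature positions vary across it. The function names here are hypothetical, the codes are the short illustrative ones used above rather than a full tagging scheme, and the sketch assumes all codes in a disjunction have the same length.&lt;/p&gt;

```python
def differing_features(code_a, code_b):
    """Positions at which two same-length parse codes disagree."""
    return [i for i, (a, b) in enumerate(zip(code_a, code_b)) if a != b]


def classify(codes):
    """Label a disjunction of parse codes by how many features vary across it."""
    codes = sorted(codes)
    if len(codes) == 1:
        return "unambiguous"
    varying = set()
    for other in codes[1:]:
        varying.update(differing_features(codes[0], other))
    return "single-feature" if len(varying) == 1 else "multi-feature"


print(classify({"NSN", "ASN"}))         # only case varies
print(classify({"GPM", "GPF", "GPN"}))  # only gender varies
print(classify({"GSF", "APF"}))         # case and number vary together
print(classify({"1S", "3P"}))           # person and number vary together
```

&lt;p&gt;With something like this, searching a text for forms whose disjunction is multi-feature becomes a simple filter.&lt;/p&gt;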
&lt;p&gt;Also just in terms of motivation: clearly a morphological analysis that ignores downward disambiguation from syntax or semantics is unhelpful (and potentially even misleading) for exegesis and so a lot of use cases wouldn’t want to do it. HOWEVER, my goal is threefold:&lt;/p&gt;
&lt;p&gt;(1) I want to have a way to model the output of automated morphological analysis systems prior to either automated or human downward disambiguation;&lt;br /&gt;
(2) as someone studying how morphology works from a cognitive point of view, I care about modelling how ambiguity is resolved at different levels and so want a model that can handle that;&lt;br /&gt;
(3) because a student is quite likely to be confronted with this ambiguity, it needs to be in my learning models. I want to be able to search for cases where 1S vs 3P ambiguity or GSF vs APF ambiguity or NSN vs ASN ambiguity is resolved by syntax or semantics so they can be illustrated to the student. I want to know, for a given passage, whether such ambiguity exists so learning can be appropriately scaffolded. And note that, for me, this extends to ambiguity resolved by just accentuation as well (which is another potentially useful thing to model for various applications).&lt;/p&gt;
&lt;p&gt;In conclusion, I want to again state I&#39;m not at all against a functional, fully disambiguated parse code existing. I have NEVER proposed REPLACING the existing tagging schemes. I just want to add a new column useful for the reasons I&#39;ve listed above in (1) – (3) and produce new resources that perhaps ONLY use that purely morphological parse code.&lt;/p&gt;
&lt;p&gt;Finally I want to note there&#39;s an important difference between what we put in our data and how we present it to users. People should not assume that when I&#39;m describing codes to use in data that I&#39;m suggesting that&#39;s what end-users should see.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE&lt;/strong&gt;: one topic I didn&#39;t discuss here is ambiguity in endings that is resolved by knowledge of the stems or principal parts. For example, without a lexicon, there are ambiguities between imperfect and aorist that are easily resolved with additional lexical-level information.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">On my &lt;a href=&#34;/now/&#34;&gt;now&lt;/a&gt; page, I currently list &#34;finalising an improved set of morphology tags to use&#34; under Medium Term. As I find myself sometimes having to clarify the motivation for and state of this, I thought I&#39;d share what I just wrote in the &lt;a href=&#34;http://biblicalhumanities.org&#34;&gt;Biblical Humanities&lt;/a&gt; Slack.</summary>
  </entry><entry>
    <title type="html">An Initial Reboot of Oxlos</title>
    <link href="https://jktauber.com/2017/04/18/initial-reboot-oxlos/" rel="alternate" type="text/html" title="An Initial Reboot of Oxlos"/>
    <published>2017-04-18</published>
    <updated>2017-04-18</updated>
    <id>https://jktauber.com/2017/04/18/initial-reboot-oxlos</id>
    <content type="html" xml:base="https://jktauber.com/2017/04/18/initial-reboot-oxlos/">&lt;p&gt;In a recent post, &lt;a href=&#34;/2017/04/10/update-lxx-progress/&#34;&gt;Update on LXX Progress&lt;/a&gt;, I talked about the possibility of putting together a crowd-sourcing tool to help share the load of clarifying some parse code errors in the CATSS LXX morphological analysis. Last Friday, Patrick Altman and I spent an evening of hacking and built the tool.&lt;/p&gt;
&lt;p&gt;Back at BibleTech 2010, I gave a talk about Django, Pinax, and some early ideas for a platform built on them to do collaborative corpus linguistics. Patrick Altman was my main co-developer on some early prototypes and I ended up hiring him to work with me at Eldarion.&lt;/p&gt;
&lt;p&gt;The original project was called &lt;strong&gt;oxlos&lt;/strong&gt; after the betacode transcription of the Greek word for &#34;crowd&#34;, a nod to &#34;crowd-sourcing&#34;. Work didn&#39;t continue much past those original prototypes in 2010 and Pinax has come a long way since so, when we decided to work on oxlos again, it made sense to start from scratch. From the initial commit to launching the site took about six hours.&lt;/p&gt;
&lt;p&gt;At the moment there is one collective task available—clarifying which of a set of parse codes is valid for a given verb form in the LXX—but as the need for others arises, it will be straightforward to add them (and please contact me if you have similar tasks you&#39;d like added to the site).&lt;/p&gt;
&lt;p&gt;If you&#39;re a Django developer, you are welcome to contribute. The code is open source under an MIT license and available at &lt;a href=&#34;https://github.com/jtauber/oxlos2&#34;&gt;https://github.com/jtauber/oxlos2&lt;/a&gt;. We have lots we can potentially add beyond merely different kinds of tasks.&lt;/p&gt;
&lt;p&gt;If your Greek morphology is reasonably strong, I invite you to sign up at&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;a href=&#34;http://oxlos.org/&#34;&gt;http://oxlos.org/&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;and help out with the LXX verb parsing task.&lt;/p&gt;
&lt;p&gt;It&#39;s probably not that relevant anymore, but you can watch the original 2010 talk below. I&#39;d skip past the Django / Pinax intro and go straight to about 37:00 where I start to discuss the collective intelligence platform.&lt;/p&gt;
&lt;iframe src=&#34;https://player.vimeo.com/video/10515200&#34; width=&#34;640&#34; height=&#34;363&#34; frameborder=&#34;0&#34; webkitallowfullscreen mozallowfullscreen allowfullscreen&gt;&lt;/iframe&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">In a recent post, &lt;a href=&#34;/2017/04/10/update-lxx-progress/&#34;&gt;Update on LXX Progress&lt;/a&gt;, I talked about the possibility of putting together a crowd-sourcing tool to help share the load of clarifying some parse code errors in the CATSS LXX morphological analysis. Last Friday, Patrick Altman and I spent an evening of hacking and built the tool.</summary>
  </entry><entry>
    <title type="html">Analysing the Verbs in Nestle 1904</title>
    <link href="https://jktauber.com/2017/04/17/analysing-verbs-nestle-1904/" rel="alternate" type="text/html" title="Analysing the Verbs in Nestle 1904"/>
    <published>2017-04-17</published>
    <updated>2017-04-17</updated>
    <id>https://jktauber.com/2017/04/17/analysing-verbs-nestle-1904</id>
    <content type="html" xml:base="https://jktauber.com/2017/04/17/analysing-verbs-nestle-1904/">&lt;p&gt;The last couple of weeks, I&#39;ve been working on getting my &lt;code&gt;greek-inflexion&lt;/code&gt; code working on Ulrik Sandborg-Petersen&#39;s analysis of the Nestle 1904. The first pass of this is now done.&lt;/p&gt;
&lt;p&gt;The motivation for doing this work was (a) to expand the verb stem database and stemming rules; (b) to be able to annotate the Nestle 1904 with additional morphological information for my adaptive reader and some similar work Jonathan Robie is doing.&lt;/p&gt;
&lt;p&gt;My usual first step when dealing with a new text is to automatically generate as many new entries in the lexicon / stem-database as I can (see the first step in &lt;a href=&#34;/2017/04/10/update-lxx-progress/&#34;&gt;Update on LXX Progress&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;In some cases, this is just a new stem for an existing verb because of a new form of an already known verb. But sometimes it&#39;s an entirely new verb.&lt;/p&gt;
&lt;p&gt;I thought the Nestle 1904 would be considerably easier than the LXX because the text is so similar but there were numerous challenges that arose.&lt;/p&gt;
&lt;p&gt;It became clear very quickly that there were considerable differences in lemma choice between the Nestle 1904 and the MorphGNT SBLGNT. This didn&#39;t completely surprise me: I&#39;ve spent quite a bit of time cataloging lemma choice differences between lexical resources and there are considerable differences even between BDAG and Danker&#39;s Concise Lexicon.&lt;/p&gt;
&lt;p&gt;But even these aside, there were 7,743 out of 28,352 verbs mismatching after my code had already done its best to automatically fill in missing lexical entries and stems.&lt;/p&gt;
&lt;p&gt;A. The normalisation column in Nestle 1904 doesn&#39;t normalise capitalisation, clitic accentuation, or moveable nu, all of which greek-inflexion assumes has been done.&lt;/p&gt;
&lt;p&gt;Capitalisation alone accounted for 1042 mismatches. Clitic accentuation alone accounted for 1008 mismatches. Moveable nu alone accounted for 4153 mismatches.&lt;/p&gt;
&lt;p&gt;B. Nestle 1904 systematically avoids assimilation of συν and ἐν preverbs.&lt;/p&gt;
&lt;p&gt;Taken alone, these accounted for 91 mismatches. Mapping prior to analysis by &lt;code&gt;greek-inflexion&lt;/code&gt; is somewhat of a hack that I&#39;ll address in later passes.&lt;/p&gt;
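&lt;p&gt;A minimal sketch of that kind of pre-analysis mapping might look like the following. It is an illustration only: the &lt;code&gt;assimilate&lt;/code&gt; function and its table are hypothetical, it only looks at the start of an unaugmented form, and it deliberately leaves out the trickier cases (ν before σ, ζ, or ρ).&lt;/p&gt;

```python
import unicodedata

# Assimilation of a preverb-final ν before the following consonant.
ASSIMILATION = {}
ASSIMILATION.update(dict.fromkeys("πβφψμ", "μ"))  # before labials and μ
ASSIMILATION.update(dict.fromkeys("κγχξ", "γ"))   # before velars
ASSIMILATION.update({"λ": "λ"})                    # before λ

PREVERBS = ("συν", "ἐν")


def assimilate(form):
    """Rewrite an unassimilated συν-/ἐν- preverb in its assimilated spelling."""
    form = unicodedata.normalize("NFC", form)
    for preverb in PREVERBS:
        preverb = unicodedata.normalize("NFC", preverb)
        rest = form[len(preverb):]
        if form.startswith(preverb) and rest[:1] in ASSIMILATION:
            return preverb[:-1] + ASSIMILATION[rest[0]] + rest
    return form  # nothing to assimilate


for form in ["συνλαλέω", "συνπνίγω", "ἐνκακέω", "συνάγω"]:
    print(form, assimilate(form))
```

&lt;p&gt;Forms where the preverb is followed by a vowel (including an augment) pass through unchanged, which is the behaviour you&#39;d want from a mapping applied blindly before analysis.&lt;/p&gt;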
&lt;p&gt;C. There were 8 spelling differences in the endings which required an update to stemming.yaml:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;κατασκηνοῖν (PAN) in Matt 13:32&lt;/li&gt;
&lt;li&gt;κατασκηνοῖν (PAN) in Mark 4:32&lt;/li&gt;
&lt;li&gt;ἀποδεκατοῖν (PAN) in Heb 7:5&lt;/li&gt;
&lt;li&gt;φυσιοῦσθε (PMS-2P) in 1Cor 4:6&lt;/li&gt;
&lt;li&gt;εἴχαμεν (IAI.1P) in 2John 1:5&lt;/li&gt;
&lt;li&gt;εἶχαν (IAI.3P) in Mark 8:7&lt;/li&gt;
&lt;li&gt;εἶχαν (IAI.3P) in Rev 9:8&lt;/li&gt;
&lt;li&gt;παρεῖχαν (IAI.3P) in Acts 28:2&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;D. The different parse code scheme (Robinson&#39;s vs CCAT) had to be mapped over.&lt;/p&gt;
&lt;p&gt;This should have been straightforward but voice in the formal morphology field sometimes seemed to be messed up (which I corrected as part of G. below).&lt;/p&gt;
&lt;p&gt;E. There were 182 differences (type not token) in lemma choice, mostly active vs middle forms.&lt;/p&gt;
&lt;p&gt;See &lt;a href=&#34;https://gist.github.com/jtauber/28ddfeee3175903026dade4ab965ac6c#file-lemma-differences-txt&#34;&gt;https://gist.github.com/jtauber/28ddfeee3175903026dade4ab965ac6c#file-lemma-differences-txt&lt;/a&gt; for the full list.&lt;/p&gt;
&lt;p&gt;F. There were a small handful of per-form lemma corrections I made:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ἐπεστείλαμεν AAI.1P ἀποστέλλω ἐπιστέλλω&lt;/li&gt;
&lt;li&gt;ἀγαθουργῶν PAP.NSM ἀγαθοεργέω ἀγαθουργέω&lt;/li&gt;
&lt;li&gt;συνειδυίης XAP.GSF συνοράω σύνοιδα&lt;/li&gt;
&lt;li&gt;γαμίσκονται PMI.3P γαμίζω γαμίσκω&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;G. Finally, I made 69 (type not token) parse code changes.&lt;/p&gt;
&lt;p&gt;See &lt;a href=&#34;https://gist.github.com/jtauber/28ddfeee3175903026dade4ab965ac6c#file-parse-txt&#34;&gt;https://gist.github.com/jtauber/28ddfeee3175903026dade4ab965ac6c#file-parse-txt&lt;/a&gt; for the list.&lt;/p&gt;
&lt;p&gt;With all this, the &lt;code&gt;greek-inflexion&lt;/code&gt; code (on a branch not yet pushed at the time of writing) can correctly generate all the verbs in the Nestle 1904 morphology.&lt;/p&gt;
&lt;p&gt;There are definitely improvements I need to make in a second pass and at least a small number of corrections that I think need to be made to the Nestle 1904 analysis.&lt;/p&gt;
&lt;p&gt;But it&#39;s now possible for me to produce an initial verb stem annotation for the Nestle 1904 and I&#39;m a step closer to a morphological lexicon with broader coverage.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE&lt;/strong&gt;: I&#39;ve added some more parse corrections but not yet updated the gist.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">The last couple of weeks, I&#39;ve been working on getting my &lt;code&gt;greek-inflexion&lt;/code&gt; code working on Ulrik Sandborg-Petersen&#39;s analysis of the Nestle 1904. The first pass of this is now done.</summary>
  </entry><entry>
    <title type="html">Update on LXX Progress</title>
    <link href="https://jktauber.com/2017/04/10/update-lxx-progress/" rel="alternate" type="text/html" title="Update on LXX Progress"/>
    <published>2017-04-10</published>
    <updated>2017-04-10</updated>
    <id>https://jktauber.com/2017/04/10/update-lxx-progress</id>
    <content type="html" xml:base="https://jktauber.com/2017/04/10/update-lxx-progress/">&lt;p&gt;As mentioned in previous posts, I&#39;ve been working through the LXX, initially making sure my &lt;code&gt;greek-inflexion&lt;/code&gt; library can generate the same analysis of verbs as the CATSS LXX Morphology and adding to the verb stem database accordingly. This is a preliminary to being able to run the code on alternative LXX editions such as Swete and provide a freely available morphologically-tagged LXX.&lt;/p&gt;
&lt;p&gt;The general process has been, one book at a time:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;programmatically expand the stem database with missing stems where the analysis given by CATSS fits what &lt;code&gt;greek-inflexion&lt;/code&gt; stemming rules expect&lt;/li&gt;
&lt;li&gt;where the analysis from CATSS doesn&#39;t fit what &lt;code&gt;greek-inflexion&lt;/code&gt; expects, evaluate if it&#39;s&lt;ul&gt;
&lt;li&gt;a parse error in the CATSS (at this stage by far the most common problem, but also the most time consuming to identify and fix)&lt;/li&gt;
&lt;li&gt;a missing stemming rule (very rare at this stage)&lt;/li&gt;
&lt;li&gt;some temporary limitation of &lt;code&gt;greek-inflexion&lt;/code&gt; (it could be smarter about some accentuation, for example)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Working a few hours a week, it took about a month to do 1 Kings (i.e. 1 Samuel), in part because it had close to 100 parsing errors in the CATSS, many of them quite inexplicable (like getting the voice wrong when the ending should make that very easy to determine).&lt;/p&gt;
&lt;p&gt;The work up until this point covers about 35% of the LXX, but I decided for the rest to go broad rather than book-by-book.&lt;/p&gt;
&lt;p&gt;In other words, I&#39;ve expanded the stem database (per step one above) for the entire LXX in one go and will now work through the problem cases.&lt;/p&gt;
&lt;p&gt;What is very encouraging is that expanding the verbs attempted from 35% to 100% only led to 731 analysis mismatches in 1,875 locations. Given the LXX has just over 100,000 verbs, that&#39;s less than a 2% error rate.&lt;/p&gt;
&lt;p&gt;Let me be clear, however, what I&#39;m claiming. I&#39;m NOT saying I can morphologically tag verbs with 98% accuracy. I&#39;m merely saying that 98% of the CATSS LXX morphological analysis can be explained by the rules and data in &lt;code&gt;greek-inflexion&lt;/code&gt;. The other 2% is likely to MOSTLY be errors in the CATSS analysis with a few errors in my stem database, stemming rules, or accentuation rules.&lt;/p&gt;
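&lt;p&gt;As a quick check of the arithmetic behind those percentages, here is a small Python sketch. It uses the 1,875 mismatched locations quoted above and rounds the verb count down to 100,000 (the text only says &#34;just over&#34;), so the true mismatch rate would be slightly lower than what it prints.&lt;/p&gt;

```python
# Figures quoted above; the verb total is rounded down from "just over 100,000".
mismatch_locations = 1875
total_verbs = 100_000

mismatch_rate = mismatch_locations / total_verbs
explained_rate = 1 - mismatch_rate

print(f"mismatch rate: {mismatch_rate:.3%}")
print(f"explained by the rules and data: {explained_rate:.3%}")
```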
&lt;p&gt;At the rate I worked through 1 Kings, going through the rest of the mismatches might take the rest of the year, but I think I can speed things up by batching similar kinds of mismatches together. For example, there are 586 forms where &lt;code&gt;greek-inflexion&lt;/code&gt; didn&#39;t generate the form in the CATSS analysis with the morphosyntactic properties given but was able to generate the form with different morphosyntactic properties. In almost all cases that corresponds to a mistake in the CATSS analysis. It&#39;s the most time consuming part to deal with but batching them up together (especially dealing with the same mismatch across all remaining books at once) should speed things up.&lt;/p&gt;
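&lt;p&gt;The batching idea could look something like the following minimal sketch. It is not the actual &lt;code&gt;greek-inflexion&lt;/code&gt; tooling: the function name, the tuple layout, and the placeholder forms and parse labels are all assumptions made for illustration. Mismatches are grouped by the pair of (analysis given, analysis generated), so each kind of disagreement can be reviewed in one sitting.&lt;/p&gt;

```python
from collections import defaultdict


def batch_mismatches(mismatches):
    """Group (form, given_parse, generated_parse) tuples by the parse pair."""
    batches = defaultdict(list)
    for form, given_parse, generated_parse in mismatches:
        batches[(given_parse, generated_parse)].append(form)
    # Largest batches first, so the most common kind of mismatch is reviewed first.
    return sorted(batches.items(), key=lambda item: len(item[1]), reverse=True)


# Placeholder data only: these are not real forms or real parse codes.
example_mismatches = [
    ("form-1", "given-X", "generated-Y"),
    ("form-2", "given-P", "generated-Q"),
    ("form-3", "given-X", "generated-Y"),
]

for parse_pair, forms in batch_mismatches(example_mismatches):
    print(parse_pair, forms)
```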
&lt;p&gt;It may also lend itself to crowd-sourcing. I could probably pretty easily whip up a little website that shows people the form and asks them to choose between the CATSS analysis and the &lt;code&gt;greek-inflexion&lt;/code&gt; analysis (not telling them which is which).&lt;/p&gt;
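&lt;p&gt;The core of such a site would be presenting the two analyses in random order while keeping track of which is which out of the reviewer&#39;s sight. A minimal sketch, with a hypothetical function name and placeholder values rather than real forms or parse codes:&lt;/p&gt;

```python
import random


def blind_choice(form, catss_parse, generated_parse, rng):
    """Return the form, two shuffled options, and a hidden answer key."""
    labelled = [("catss", catss_parse), ("greek-inflexion", generated_parse)]
    rng.shuffle(labelled)
    options = [parse for source, parse in labelled]
    answer_key = [source for source, parse in labelled]
    return form, options, answer_key


rng = random.Random(0)  # seeded only to make the sketch repeatable
form, options, answer_key = blind_choice("form-1", "parse-A", "parse-B", rng)
print(form, options)  # only these are shown; answer_key is kept back
```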
&lt;p&gt;It may be worth me spending a few hours setting that up!&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">As mentioned in previous posts, I&#39;ve been working through the LXX, initially making sure my &lt;code&gt;greek-inflexion&lt;/code&gt; library can generate the same analysis of verbs as the CATSS LXX Morphology and adding to the verb stem database accordingly. This is a preliminary to being able to run the code on alternative LXX editions such as Swete and provide a freely available morphologically-tagged LXX.</summary>
  </entry><entry>
    <title type="html">New MorphGNT Releases and Accentuation Analysis</title>
    <link href="https://jktauber.com/2017/02/15/new-morphgnt-releases-and-accentuation-analysis/" rel="alternate" type="text/html" title="New MorphGNT Releases and Accentuation Analysis"/>
    <published>2017-02-15</published>
    <updated>2017-02-15</updated>
    <id>https://jktauber.com/2017/02/15/new-morphgnt-releases-and-accentuation-analysis</id>
    <content type="html" xml:base="https://jktauber.com/2017/02/15/new-morphgnt-releases-and-accentuation-analysis/">&lt;p&gt;Over the last few weeks, I&#39;ve made a number of new releases of the MorphGNT SBLGNT analysis fixing some accentuation issues mostly in the normalization column. This came out of ongoing work on modelling accentuation (and, in particular, rules around clitics).&lt;/p&gt;
&lt;p&gt;Back in 2015, I talked about &lt;a href=&#34;/2015/11/27/annotating-normalization-column-morphgnt-part-1/&#34;&gt;Annotating the Normalization Column in MorphGNT&lt;/a&gt;. This post could almost be considered Part 2.&lt;/p&gt;
&lt;p&gt;I recently went back to that work and made a fresh start on a new repo, &lt;a href=&#34;https://github.com/jtauber/gnt-accentuation&#34;&gt;gnt-accentuation&lt;/a&gt;, intended to explain the accentuation of each word in the GNT (and eventually other Greek texts). There are two parts to that: explaining why the normalized form is accented the way it is, and then explaining why the word-in-context might be accented differently (clitics, etc.). The repo is eventually going to do both but I started with the latter.&lt;/p&gt;
&lt;p&gt;My goal with that repo is to be part of the larger vision of an &#34;executable grammar&#34; I&#39;ve talked about for years where rules about, say, enclitics, are formally written up in a way that can be tested against the data. This means:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;students reading a rule can immediately jump to real examples (or exceptions)&lt;/li&gt;
&lt;li&gt;students confused by something in a text can immediately jump to rules explaining it&lt;/li&gt;
&lt;li&gt;the correctness of the rules can be tested&lt;/li&gt;
&lt;li&gt;errors in the text can be found&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The fourth point is how my recent work uncovered some accentuation issues in the SBLGNT text, the normalization, and the lemmatization. Some of these have been corrected in a series of new releases of the MorphGNT: 6.08, 6.09, and 6.10. See &lt;a href=&#34;https://github.com/morphgnt/sblgnt/releases&#34;&gt;https://github.com/morphgnt/sblgnt/releases&lt;/a&gt; for the specifics. The reason for so many releases is that I wanted to get corrections out as soon as I made them, but then I kept finding more issues!&lt;/p&gt;
&lt;p&gt;There are some issues in the text itself which need to be resolved. See the GitHub issue &lt;a href=&#34;https://github.com/morphgnt/sblgnt/issues/52&#34;&gt;https://github.com/morphgnt/sblgnt/issues/52&lt;/a&gt; for details. I&#39;d very much appreciate people&#39;s input.&lt;/p&gt;
&lt;p&gt;In the meantime, stay tuned for more progress on &lt;code&gt;gnt-accentuation&lt;/code&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Over the last few weeks, I&#39;ve made a number of new releases of the MorphGNT SBLGNT analysis fixing some accentuation issues mostly in the normalization column. This came out of ongoing work on modelling accentuation (and, in particular, rules around clitics).</summary>
  </entry><entry>
    <title type="html">First Pass of MorphGNT Verb Coverage and LXX Beginnings</title>
    <link href="https://jktauber.com/2017/01/02/first-pass-morphgnt-verb-coverage-and-lxx-beginnin/" rel="alternate" type="text/html" title="First Pass of MorphGNT Verb Coverage and LXX Beginnings"/>
    <published>2017-01-02</published>
    <updated>2017-01-02</updated>
    <id>https://jktauber.com/2017/01/02/first-pass-morphgnt-verb-coverage-and-lxx-beginnin</id>
    <content type="html" xml:base="https://jktauber.com/2017/01/02/first-pass-morphgnt-verb-coverage-and-lxx-beginnin/">&lt;p&gt;In &lt;a href=&#34;/2016/12/02/greek-inflexion-and-update-morphological-lexicon/&#34;&gt;greek-inflexion and an Update on the Morphological Lexicon&lt;/a&gt; I said that all the verbs in the MorphGNT SBLGNT analysis should be done by the end of the year. I hit that goal and made a decent start on the Septuagint.&lt;/p&gt;
&lt;p&gt;As mentioned in that previous post, by May 2016 I could generate every single verb form in:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Louise Pratt’s intermediate grammar&lt;/li&gt;
&lt;li&gt;Helma Dik’s Greek verb handouts&lt;/li&gt;
&lt;li&gt;Andrew Keller &amp;amp; Stephanie Russell’s beginner-intermediate text book&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;On December 8th, I&#39;d actually finished coverage of &lt;strong&gt;all the verbs in the MorphGNT SBLGNT&lt;/strong&gt; (with a little bit of help from Nathan Smith).&lt;/p&gt;
&lt;p&gt;The stem database is available at &lt;a href=&#34;https://github.com/jtauber/greek-inflexion/blob/morphgnt/morphgnt_lexicon.yaml&#34;&gt;https://github.com/jtauber/greek-inflexion/blob/morphgnt/morphgnt_lexicon.yaml&lt;/a&gt;. I should emphasize, though, this is just a first pass and there&#39;s more work to do but the coverage is now there.&lt;/p&gt;
&lt;p&gt;I immediately started work on applying the &lt;code&gt;greek-inflexion&lt;/code&gt; code and stemming rules to the CATSS analysis of the LXX. By the end of 2016, I&#39;d built a stem database and updated the stemming rules to cover the Pentateuch, 1 Maccabees, Jonah, Nahum, and Ezra-Nehemiah. Work on the rest of the CATSS analysis will continue over the next few months.&lt;/p&gt;
&lt;p&gt;I decided to start a new stem database from scratch for the LXX (although I recently wrote a script to compare stem databases for inconsistencies). My primary reason for this was to see if I ended up with the same analysis for a verb stem as a way of catching potential errors in my original MorphGNT analysis. The classical Greek exemplars listed above, the MorphGNT SBLGNT and the LXX analysis all share the same stemming rules, though.&lt;/p&gt;
&lt;p&gt;My reasons for doing the stem analysis on the CATSS morphological analysis were threefold:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;expand coverage of the stem database to more parts for existing verbs as well as new verbs&lt;/li&gt;
&lt;li&gt;provide broader tests for the stemming rules&lt;/li&gt;
&lt;li&gt;prepare for a morphological analysis of the Swete text of the LXX/OG.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A fourth benefit quickly emerged, though: I found errors in the CATSS analysis.&lt;/p&gt;
&lt;p&gt;I&#39;ve been maintaining patch files which, after a review pass, I&#39;ll contribute back to CCAT (if they are interested). Fun fact: it was contributing corrections back to the CCAT&#39;s GNT analysis which started me on the path to MorphGNT 24 years ago!&lt;/p&gt;
&lt;p&gt;The patches are available at &lt;a href=&#34;https://github.com/jtauber/greek-inflexion/tree/lxx/lxxmorph&#34;&gt;https://github.com/jtauber/greek-inflexion/tree/lxx/lxxmorph&lt;/a&gt;. They need to be reviewed as they all pretty much assume the text is correct (including accentuation, which was a major reason for the corrections I made) and I&#39;ve redone the analysis without considering context. &lt;strong&gt;An easy way to contribute would be to help review these patch files.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;All this work on &lt;code&gt;greek-inflexion&lt;/code&gt; has led to some improvements to the underlying &lt;code&gt;inflexion&lt;/code&gt; library as well as numerous corrections to &lt;code&gt;greek-accentuation&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Work on the LXX coverage will continue as well as expansion to other texts (both Hellenistic and Classical).&lt;/p&gt;
&lt;p&gt;Also in an early stage is better modeling of stem formation and endings.&lt;/p&gt;
&lt;p&gt;Finally, the fruits of all this will soon be applied to the online Greek reader I talked about at SBL 2016, with a goal to release a prototype for the Johannine gospel and epistles in a couple of months.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">In &lt;a href=&#34;/2016/12/02/greek-inflexion-and-update-morphological-lexicon/&#34;&gt;greek-inflexion and an Update on the Morphological Lexicon&lt;/a&gt; I said that all the verbs in the MorphGNT SBLGNT analysis should be done by the end of the year. I hit that goal and made a decent start on the Septuagint.</summary>
  </entry><entry>
    <title type="html">Diacritic Stacking in Skolar PE Fixed</title>
    <link href="https://jktauber.com/2016/12/04/diacritic-stacking-skolar-pe-fixed/" rel="alternate" type="text/html" title="Diacritic Stacking in Skolar PE Fixed"/>
    <published>2016-12-04</published>
    <updated>2016-12-04</updated>
    <id>https://jktauber.com/2016/12/04/diacritic-stacking-skolar-pe-fixed</id>
    <content type="html" xml:base="https://jktauber.com/2016/12/04/diacritic-stacking-skolar-pe-fixed/">&lt;p&gt;Back in &lt;a href=&#34;/2016/01/28/polytonic-greek-unicode-is-still-not-perfect/&#34;&gt;Polytonic Greek Unicode Still Isn’t Perfect&lt;/a&gt; and &lt;a href=&#34;/2016/02/09/updated-solution-polytonic-greek-unicodes-problems/&#34;&gt;An Updated Solution to Polytonic Greek Unicode’s Problems&lt;/a&gt; I talked about problems with stacking vowel length and other diacritics. At least in terms of the font used on this site, the problems are now solved.&lt;/p&gt;
&lt;p&gt;After discussions on the Unicode mailing list, it was clear that the solution to better handling of complex diacritic stacking in polytonic Greek was NOT more precomposed forms but better support in fonts, etc. So I reached out to David Březina, the creator of the Skolar typeface, used on this site, to see if the issues could be addressed.&lt;/p&gt;
&lt;p&gt;I&#39;m delighted to say that Březina&#39;s foundry &lt;a href=&#34;https://www.rosettatype.com&#34;&gt;Rosetta Type&lt;/a&gt; has released new versions of Skolar PE that address all the issues I had.&lt;/p&gt;
&lt;p&gt;I&#39;ve now switched over this site to use the new version, which does mean those old posts complaining about the issues will read a little funny as they won&#39;t actually show examples of the problems they purport to.&lt;/p&gt;
&lt;p&gt;Thank you, David, for listening to my input and making my favourite Greek typeface even better!&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE (2017-01-06)&lt;/strong&gt;: turns out I also needed to add &lt;code&gt;font-feature-settings: &#34;ccmp&#34;;&lt;/code&gt; for it to work on Safari.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Back in &lt;a href=&#34;/2016/01/28/polytonic-greek-unicode-is-still-not-perfect/&#34;&gt;Polytonic Greek Unicode Still Isn’t Perfect&lt;/a&gt; and &lt;a href=&#34;/2016/02/09/updated-solution-polytonic-greek-unicodes-problems/&#34;&gt;An Updated Solution to Polytonic Greek Unicode’s Problems&lt;/a&gt; I talked about problems with stacking vowel length and other diacritics. At least in terms of the font used on this site, the problems are now solved.</summary>
  </entry><entry>
    <title type="html">greek-inflexion and an Update on the Morphological Lexicon</title>
    <link href="https://jktauber.com/2016/12/02/greek-inflexion-and-update-morphological-lexicon/" rel="alternate" type="text/html" title="greek-inflexion and an Update on the Morphological Lexicon"/>
    <published>2016-12-02</published>
    <updated>2016-12-02</updated>
    <id>https://jktauber.com/2016/12/02/greek-inflexion-and-update-morphological-lexicon</id>
    <content type="html" xml:base="https://jktauber.com/2016/12/02/greek-inflexion-and-update-morphological-lexicon/">&lt;p&gt;Exactly seven months ago, I &lt;a href=&#34;/2016/05/01/inflexion-code-morphological-generation-parsing/&#34;&gt;released&lt;/a&gt; a generic library, &lt;code&gt;inflexion&lt;/code&gt;, and said I&#39;d soon follow it up with the Greek-specific stuff. While I did open-source the latter on GitHub as &lt;code&gt;greek-inflexion&lt;/code&gt; shortly thereafter, I didn&#39;t want to announce it here until it was further along. I&#39;m happy to say it now is.&lt;/p&gt;
&lt;p&gt;If you recall, I said back in May that &#34;it can currently generate every single verb form in Louise Pratt’s intermediate grammar, on Helma Dik’s Greek verb handouts and in Andrew Keller &amp;amp; Stephanie Russell’s beginner-intermediate text book&#34;. It now also has much better tooling for parsing new verb forms and guessing the stem of a given form. It also has the start of noun and adjective support.&lt;/p&gt;
&lt;p&gt;On a separate &lt;code&gt;morphgnt&lt;/code&gt; branch, it now has tooling for testing verb form generation against the MorphGNT/SBLGNT text. The coverage of the stem database is the gospel and epistles of John, Galatians and Mark. I expect to have complete MorphGNT/SBLGNT verb coverage by the end of the year.&lt;/p&gt;
&lt;p&gt;The repo is at &lt;a href=&#34;https://github.com/jtauber/greek-inflexion&#34;&gt;https://github.com/jtauber/greek-inflexion&lt;/a&gt;. Note that it&#39;s not pip-installable at the moment and that hasn&#39;t been a priority as it&#39;s not a library.&lt;/p&gt;
&lt;p&gt;As mentioned in my May post, most of the value (and effort) is not so much in the code but in the data. The stemming rules and, in particular, the stem database form the core of the Morphological Lexicon I&#39;ve been working on for a few years.&lt;/p&gt;
&lt;p&gt;The best discussion of the Morphological Lexicon can be found in my &lt;a href=&#34;https://www.academia.edu/18816954/A_Morphological_Lexicon_of_New_Testament_Greek&#34;&gt;SBL 2015 Slides&lt;/a&gt; although the vision can be found way back in &lt;a href=&#34;/2004/12/09/morphgnt-v504-and-beyond/&#34;&gt;this blog post&lt;/a&gt; from 2004 where I say:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;the idea is that surface forms, lexical forms, spelling variations, roots, stems, suppletion, morphophonological rules, etc. will all be catalogued with relationships between them expressed as a directed labelled graph.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So good progress is being made (and it&#39;s all available openly as work progresses) and the initial stem and morphophonological rule databases should be completed in the next month.&lt;/p&gt;
&lt;p&gt;Alongside that I&#39;m also looking at better representing relationships between stems and also relationships between the stemming rules.&lt;/p&gt;
&lt;p&gt;Ultimately, as discussed in my SBL 2015 talk and elsewhere, my goals are to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;freely provide, in a machine-actionable way, all of the morphological information normally found in a Greek lexicon&lt;/li&gt;
&lt;li&gt;facilitate tagging of new Greek texts&lt;/li&gt;
&lt;li&gt;provide the underlying information to drive a new generation of adaptive Greek readers (the topic of my 2016 SBL talk)&lt;/li&gt;
&lt;li&gt;contribute a comprehensive analysis of Ancient Greek of interest to general morphologists&lt;/li&gt;
&lt;li&gt;experiment with the notion of an &#34;executable grammar&#34; where all paradigms, rules and assertions are tested automatically against a corpus and, with it, replace the existing plethora of books on paradigms and principal parts.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Particular thanks to Jonathan Robie, who continues to provide the inspiration and encouragement for a lot of this work.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Exactly seven months ago, I &lt;a href=&#34;/2016/05/01/inflexion-code-morphological-generation-parsing/&#34;&gt;released&lt;/a&gt; a generic library, &lt;code&gt;inflexion&lt;/code&gt;, and said I&#39;d soon follow it up with the Greek-specific stuff. While I did open-source the latter on GitHub as &lt;code&gt;greek-inflexion&lt;/code&gt; shortly thereafter, I didn&#39;t want to announce it here until it was further along. I&#39;m happy to say it now is.</summary>
  </entry><entry>
    <title type="html">More on Diagramming Greek Accent Placement</title>
    <link href="https://jktauber.com/2016/11/26/more-diagramming-greek-accent-placement/" rel="alternate" type="text/html" title="More on Diagramming Greek Accent Placement"/>
    <published>2016-11-26</published>
    <updated>2016-11-26</updated>
    <id>https://jktauber.com/2016/11/26/more-diagramming-greek-accent-placement</id>
    <content type="html" xml:base="https://jktauber.com/2016/11/26/more-diagramming-greek-accent-placement/">&lt;p&gt;I&#39;ve put together slides and a voice-over to further explain Greek accent placement from a moraic point-of-view.&lt;/p&gt;
&lt;p&gt;After posting &lt;a href=&#34;/2016/11/07/diagramming-greek-accent-placement/&#34;&gt;Diagramming Greek Accent Placement&lt;/a&gt;, a couple of people asked me to unpack the second diagram, so I put together a series of slides with a view to perhaps doing a voice-over to accompany them.&lt;/p&gt;
&lt;p&gt;I put the slides up at &lt;a href=&#34;https://www.academia.edu/29725241/Basic_Greek_Accentuation&#34;&gt;https://www.academia.edu/29725241/Basic_Greek_Accentuation&lt;/a&gt; and immediately got a suggestion to do a voice-over.&lt;/p&gt;
&lt;p&gt;Here&#39;s the resultant video:&lt;/p&gt;
&lt;iframe src=&#34;https://player.vimeo.com/video/191687615&#34; width=&#34;640&#34; height=&#34;480&#34; frameborder=&#34;0&#34; webkitallowfullscreen mozallowfullscreen allowfullscreen&gt;&lt;/iframe&gt;
&lt;p&gt;&lt;a href=&#34;https://vimeo.com/191687615&#34;&gt;Basic Greek Accentuation&lt;/a&gt; from &lt;a href=&#34;https://vimeo.com/user3466366&#34;&gt;James Tauber&lt;/a&gt; on &lt;a href=&#34;https://vimeo.com&#34;&gt;Vimeo&lt;/a&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I&#39;ve put together slides and a voice-over to further explain Greek accent placement from a moraic point-of-view.</summary>
  </entry><entry>
    <title type="html">greek-accentuation 1.0.4 Released</title>
    <link href="https://jktauber.com/2016/11/26/greek-accentuation-104-released/" rel="alternate" type="text/html" title="greek-accentuation 1.0.4 Released"/>
    <published>2016-11-26</published>
    <updated>2016-11-26</updated>
    <id>https://jktauber.com/2016/11/26/greek-accentuation-104-released</id>
    <content type="html" xml:base="https://jktauber.com/2016/11/26/greek-accentuation-104-released/">&lt;p&gt;Three weeks ago I fixed a few bugs in &lt;code&gt;greek-accentuation&lt;/code&gt; and ended up doing three releases (although I only blogged about two at the time). I&#39;ve now done a fourth bug fix release: 1.0.4.&lt;/p&gt;
&lt;p&gt;1.0.3 was the bug fix mentioned in &lt;a href=&#34;/2016/11/07/diagramming-greek-accent-placement/&#34;&gt;Diagramming Greek Accent Placement&lt;/a&gt; where paroxytone wasn&#39;t being given as a possible accentuation when the penult was long and length of ultima was unknown (e.g. an unmarked alpha).&lt;/p&gt;
&lt;p&gt;To this, 1.0.4 adds two new fixes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;syllabify.is_diphthong&lt;/code&gt; now works with uppercase letters (fixes a syllabification bug when a capitalized word begins with a diphthong)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;syllabify.add_necessary_breathing&lt;/code&gt; now returns an NFKC-normalized form (improving rebreath/debreath round-tripping)&lt;/li&gt;
&lt;/ul&gt;
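&lt;p&gt;To see why normalization matters for round-tripping, here is a small sketch using only Python&#39;s standard &lt;code&gt;unicodedata&lt;/code&gt; module (it does not call &lt;code&gt;greek-accentuation&lt;/code&gt; itself). A vowel followed by a combining breathing mark and the equivalent precomposed character look identical but compare as different strings until both are normalized.&lt;/p&gt;

```python
import unicodedata

decomposed = "\u03b1\u0313"  # alpha followed by a combining smooth breathing
precomposed = "\u1f00"  # the single code point for alpha with smooth breathing

# Visually identical, but different code point sequences.
print(decomposed == precomposed)

# After NFKC normalization the decomposed sequence becomes the precomposed one.
normalized = unicodedata.normalize("NFKC", decomposed)
print(normalized == precomposed)
```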
&lt;p&gt;You can &lt;code&gt;pip install greek-accentuation==1.0.4&lt;/code&gt;. The repo is at &lt;a href=&#34;https://github.com/jtauber/greek-accentuation&#34;&gt;https://github.com/jtauber/greek-accentuation&lt;/a&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Three weeks ago I fixed a few bugs in &lt;code&gt;greek-accentuation&lt;/code&gt; and ended up doing three releases (although I only blogged about two at the time). I&#39;ve now done a fourth bug fix release: 1.0.4.</summary>
  </entry><entry>
    <title type="html">Diagramming Greek Accent Placement</title>
    <link href="https://jktauber.com/2016/11/07/diagramming-greek-accent-placement/" rel="alternate" type="text/html" title="Diagramming Greek Accent Placement"/>
    <published>2016-11-07</published>
    <updated>2016-11-07</updated>
    <id>https://jktauber.com/2016/11/07/diagramming-greek-accent-placement</id>
    <content type="html" xml:base="https://jktauber.com/2016/11/07/diagramming-greek-accent-placement/">&lt;p&gt;Cleaning up code as part of another bug fix to &lt;code&gt;greek-accentuation&lt;/code&gt; led me to update an old diagram I&#39;d done showing the Greek accentuation possibilities in terms of morae.&lt;/p&gt;
&lt;p&gt;Back in 2014 I came up with the following diagram to try to explain that the &#34;law of limitation&#34; was fairly easy to understand in terms of morae. Once you understand the acute and circumflex accents in terms of morae, it&#39;s clear that the accent can just go on one of the final three morae but that if the penult is long and the ultima short, the next-to-last mora is skipped over.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;/images/mora_accent.jpg&#34;&gt;&lt;/p&gt;
&lt;p&gt;In trying to fix a bug in &lt;code&gt;greek-accentuation&lt;/code&gt;, I was stepping through all the possibilities again (with the additional complexity that the code there sometimes can&#39;t tell if a vowel is long or short). I realised it might be clearer to put the four combinations of penult/ultima length in a 2-by-2 matrix.&lt;/p&gt;
&lt;p&gt;I added a bit more information on the resulting accents and came up with this:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;/images/greek_accentuation_possibilities.png&#34; width=&#34;800&#34;&gt;&lt;/p&gt;
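&lt;p&gt;The same four combinations can be written out as a short Python sketch. This is not the code in &lt;code&gt;greek-accentuation&lt;/code&gt;: the function name is hypothetical, it uses the traditional accent names rather than morae, and it assumes a word of at least three syllables whose penult and ultima lengths are both known.&lt;/p&gt;

```python
def possible_accents(penult_long, ultima_long):
    """List the accentuations the basic rules allow for one length combination."""
    options = ["oxytone"]  # an acute on the ultima is always possible
    if ultima_long:
        options.append("perispomenon")  # circumflex on a long ultima
        options.append("paroxytone")  # acute on the penult; nothing further back
    else:
        # An accented long penult before a short ultima takes the circumflex.
        options.append("properispomenon" if penult_long else "paroxytone")
        options.append("proparoxytone")  # antepenult needs a short ultima
    return options


# Print the 2-by-2 matrix of penult/ultima lengths.
for penult_long in (False, True):
    for ultima_long in (False, True):
        penult = "long" if penult_long else "short"
        ultima = "long" if ultima_long else "short"
        print(f"penult {penult}, ultima {ultima}:", possible_accents(penult_long, ultima_long))
```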
&lt;p&gt;Let me know what you think. Do other people find this a helpful way to conceptualise things visually?&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Cleaning up code as part of another bug fix to &lt;code&gt;greek-accentuation&lt;/code&gt; led me to update an old diagram I&#39;d done showing the Greek accentuation possibilities in terms of morae.</summary>
  </entry><entry>
    <title type="html">greek-accentuation 1.0.2 Released (and How Persistent Accentuation Works)</title>
    <link href="https://jktauber.com/2016/11/04/greek-accentuation-102-released-and-how-persistent/" rel="alternate" type="text/html" title="greek-accentuation 1.0.2 Released (and How Persistent Accentuation Works)"/>
    <published>2016-11-04</published>
    <updated>2016-11-04</updated>
    <id>https://jktauber.com/2016/11/04/greek-accentuation-102-released-and-how-persistent</id>
    <content type="html" xml:base="https://jktauber.com/2016/11/04/greek-accentuation-102-released-and-how-persistent/">&lt;p&gt;Hot on the heels of the 1.0.1 bug fix, I&#39;ve released 1.0.2 with another fix, this time in the persistent accent placement. So I thought I&#39;d explain how persistent accent placement is implemented and what the bug was.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;greek-accentuation.accentuation&lt;/code&gt; has a function &lt;code&gt;persistent&lt;/code&gt; used for placing accents that are persistent, that is, accents that stay in place through different inflections as much as the basic accentuation rules allow.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;persistent&lt;/code&gt; function takes both the unaccented word to be accented and a lemma or base form that IS accented.&lt;/p&gt;
&lt;p&gt;The first step is to see which syllable of the base form carries the accent and what type of accent it is. Note that the position of the accent is counted in syllables from the left, not the right. The code syllabifies both the word-to-be-accented and the base form. It then works out which three (or fewer) syllable placements are allowed on the word-to-be-accented based on the basic accentuation rules. This is provided by another function &lt;code&gt;possible_accentuations&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The first thing tried is whether the exact syllable position and accent type of the base are among the possible accentuations for the word-to-be-accented. If so, we&#39;re done. If not, we try changing the accent type from acute to circumflex while keeping it in the same position. If that&#39;s still not allowed, we move on, trying to place an acute on each successively later syllable until we find an accentuation allowed by the basic rules.&lt;/p&gt;
&lt;p&gt;However, this algorithm hit a problem with accenting Ἰουδαιων using the base Ἰουδαῖος.&lt;/p&gt;
&lt;p&gt;The first thing it tries is Ἰουδαῖων, which of course is not permitted, so it immediately jumps to an acute on the next position: Ἰουδαιών. However, this is incorrect. The bug was that only a change from acute to circumflex was attempted before trying later positions. In this case, the correct thing to do was to try an acute in the same position as the original circumflex.&lt;/p&gt;
&lt;p&gt;This was an easy addition and results in the correct answer: Ἰουδαίων&lt;/p&gt;
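&lt;p&gt;The steps described above, including the fix, can be sketched in a few lines of Python. This is an illustration of the logic rather than the library&#39;s actual code: the function name and the representation of accentuations as (position, accent type) pairs are assumptions, and the set of possible accentuations is written out by hand instead of coming from &lt;code&gt;possible_accentuations&lt;/code&gt;.&lt;/p&gt;

```python
ACUTE = "acute"
CIRCUMFLEX = "circumflex"


def place_persistent(base_position, base_accent, possible, syllable_count):
    """Pick an allowed (position, accent) pair, staying close to the base form."""
    # 1. Same syllable, same accent type as the base form.
    if (base_position, base_accent) in possible:
        return (base_position, base_accent)
    # 2. Same syllable, the other accent type. Trying this in both directions
    #    (not only acute to circumflex) is the fix described above.
    other_accent = CIRCUMFLEX if base_accent == ACUTE else ACUTE
    if (base_position, other_accent) in possible:
        return (base_position, other_accent)
    # 3. An acute on each successively later syllable.
    for position in range(base_position + 1, syllable_count):
        if (position, ACUTE) in possible:
            return (position, ACUTE)
    return None


# Ἰουδαιων has four syllables (Ἰ-ου-δαι-ων); positions count from the left, from 0.
# With a long penult and a long ultima, the basic rules allow these placements:
possible = {(2, ACUTE), (3, ACUTE), (3, CIRCUMFLEX)}

# The base form Ἰουδαῖος has a circumflex on the third syllable (position 2).
print(place_persistent(2, CIRCUMFLEX, possible, 4))
```

&lt;p&gt;With step 2 in place, the sketch settles on an acute at position 2, which corresponds to Ἰουδαίων. Without it, the loop in step 3 would pick the acute on the final syllable instead.&lt;/p&gt;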
&lt;p&gt;You can &lt;code&gt;pip install greek-accentuation==1.0.2&lt;/code&gt;. The repo is at &lt;a href=&#34;https://github.com/jtauber/greek-accentuation&#34;&gt;https://github.com/jtauber/greek-accentuation&lt;/a&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Hot on the heels of the 1.0.1 bug fix, I&#39;ve released 1.0.2 with another fix, this time in the persistent accent placement. So I thought I&#39;d explain how persistent accent placement is implemented and what the bug was.</summary>
  </entry><entry>
    <title type="html">greek-accentuation 1.0.1 Released</title>
    <link href="https://jktauber.com/2016/11/03/greek-accentuation-101-released/" rel="alternate" type="text/html" title="greek-accentuation 1.0.1 Released"/>
    <published>2016-11-03</published>
    <updated>2016-11-03</updated>
    <id>https://jktauber.com/2016/11/03/greek-accentuation-101-released</id>
    <content type="html" xml:base="https://jktauber.com/2016/11/03/greek-accentuation-101-released/">&lt;p&gt;A minor bug fix release that fixes a problem with &lt;code&gt;add_necessary_breathing&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;My library for accenting Greek, which includes a function for adding missing breathing, was throwing an exception if given a word beginning with an uppercase vowel, e.g. Ιησους.&lt;/p&gt;
&lt;p&gt;The bug has now been fixed.&lt;/p&gt;
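&lt;p&gt;For illustration, here is a much-simplified sketch of adding a breathing to a word with an initial vowel, written with the standard library only. It is not the implementation of &lt;code&gt;add_necessary_breathing&lt;/code&gt;: the function name is hypothetical, it always adds a smooth breathing, and it ignores diphthongs and initial rho. The point is only that the vowel test has to cope with capitals.&lt;/p&gt;

```python
import unicodedata

SMOOTH_BREATHING = "\u0313"  # combining comma above
# Lowercase Greek vowels: alpha, epsilon, eta, iota, omicron, upsilon, omega.
VOWELS = "\u03b1\u03b5\u03b7\u03b9\u03bf\u03c5\u03c9"


def add_smooth_breathing(word):
    """Add a smooth breathing to a word that starts with a bare vowel."""
    if not word:
        return word
    first = word[0]
    # Lowercasing before the membership test is what lets an initial
    # capital vowel be recognised as a vowel.
    if first.lower() not in VOWELS:
        return word
    # NFC composes the vowel and the combining mark into one code point.
    return unicodedata.normalize("NFC", first + SMOOTH_BREATHING + word[1:])


print(add_smooth_breathing("Ιησους"))
```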
&lt;p&gt;You can &lt;code&gt;pip install greek-accentuation==1.0.1&lt;/code&gt;. The repo is at &lt;a href=&#34;https://github.com/jtauber/greek-accentuation&#34;&gt;https://github.com/jtauber/greek-accentuation&lt;/a&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">A minor bug fix release that fixes a problem with &lt;code&gt;add_necessary_breathing&lt;/code&gt;.</summary>
  </entry><entry>
    <title type="html">Thoughts on Voice</title>
    <link href="https://jktauber.com/2016/09/11/thoughts-voice/" rel="alternate" type="text/html" title="Thoughts on Voice"/>
    <published>2016-09-11</published>
    <updated>2016-09-11</updated>
    <id>https://jktauber.com/2016/09/11/thoughts-voice</id>
    <content type="html" xml:base="https://jktauber.com/2016/09/11/thoughts-voice/">&lt;p&gt;Occasionally I get into conversations about the Greek middle (or voice in general) but I&#39;ve never written down my thoughts on the topic. Here&#39;s an attempt to summarize my current thinking, although there&#39;s nothing particularly novel about it.&lt;/p&gt;
&lt;p&gt;Imagine a transitivity spectrum of high object-affectedness at one end and high subject-affectedness at the other end.&lt;/p&gt;
&lt;p&gt;When describing an event, there may be some freedom in where on the spectrum to place the description, but the different choices still have a relative ordering on the spectrum. For example, consider:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;I broke the vase&lt;/li&gt;
&lt;li&gt;The vase broke&lt;/li&gt;
&lt;li&gt;The vase was broken by me&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These three descriptions of the same event would be placed, relatively, from left to right on the spectrum.&lt;/p&gt;
&lt;p&gt;Now consider each of the following pairs. If being used to describe the same event, the first of the pair would be placed on the spectrum (again, relatively) to the left of the second of the pair:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;take / choose (choosing might just be a mental decision but taking involves action)&lt;/li&gt;
&lt;li&gt;destroy / perish&lt;/li&gt;
&lt;li&gt;resolve / deliberate (resolve is a more active step beyond merely deliberating)&lt;/li&gt;
&lt;li&gt;stop / cease&lt;/li&gt;
&lt;li&gt;honor / value (you might value something but honoring it is taking action in response to that value)&lt;/li&gt;
&lt;li&gt;show / appear (you can just appear but you can also actively show someone)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Now in the imperfective, Greek offers two sets of endings that can (and I stress &lt;em&gt;can&lt;/em&gt;) be used to capture the distinction between more to the left and more to the right on the spectrum. In the perfective, Greek offers three sets of endings.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;However&lt;/strong&gt;, where the line is drawn between these two or three segments of the spectrum to map them to the different endings is somewhat arbitrary between different words and it isn&#39;t always directly comparable between different tense-aspect forms either. A single set of endings might cover a pretty large part of the spectrum. There is also no &#34;requirement&#34; that a single lexeme use all the ending sets available. Instead, voice is available as a potential way of conveying the kinds of distinctions in the pairs above and in the three-way distinction in the vase example.&lt;/p&gt;
&lt;p&gt;Where distinctions don&#39;t need to be made, it should not surprise us to find only &#34;middle&#34; forms in use, especially in cases of lower object affectedness (like in mental verbs). This does mean in the imperfective there is not a separate form for a passive but passivization is less useful (and hence less likely) in these cases. But it should also not surprise us if some mental verbs use active forms.&lt;/p&gt;
&lt;p&gt;It should also not surprise us to find, say, the future using the middle where the present uses the active. If the imperfectives only need a two-way distinction, the perfectives can also make just a two-way distinction even if choosing to use the two middle-passive forms to do so.&lt;/p&gt;
&lt;p&gt;And if only a one-way distinction is required, there is nothing odd about a lexical item choosing to use a particular one of any of the three available voice endings (although we would expect broad tendencies to be based on object-affectedness).&lt;/p&gt;
&lt;p&gt;The &#34;active&#34; is often described as unmarked with the &#34;middle&#34; marked for subject-affectedness but I think it&#39;s actually helpful to think less about markedness and more about this transitivity spectrum of relative object-affectedness vs subject-affectedness. One can then think of voice as a largely &lt;em&gt;lexically&lt;/em&gt;-determined tool for making &lt;em&gt;relative&lt;/em&gt; contrasts on this spectrum.&lt;/p&gt;
&lt;p&gt;This way of thinking means that the names of voices should probably not be so absolute but somehow be expressed in purely relative terms. The use of &#34;middle&#34; for the middle of the three isn&#39;t bad but &#34;active&#34; and &#34;passive&#34; are highly misleading although they are &#34;more active&#34; and &#34;more passive&#34; than the &#34;middle&#34; when directly contrasting &lt;em&gt;within the same lexeme&lt;/em&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Occasionally I get into conversations about the Greek middle (or voice in general) but I&#39;ve never written down my thoughts on the topic. Here&#39;s an attempt to summarize my current thinking, although there&#39;s nothing particularly novel about it.</summary>
  </entry><entry>
    <title type="html">greek-accentuation 1.0.0 Released</title>
    <link href="https://jktauber.com/2016/07/27/greek-accentuation-100-released/" rel="alternate" type="text/html" title="greek-accentuation 1.0.0 Released"/>
    <published>2016-07-27</published>
    <updated>2016-07-27</updated>
    <id>https://jktauber.com/2016/07/27/greek-accentuation-100-released</id>
    <content type="html" xml:base="https://jktauber.com/2016/07/27/greek-accentuation-100-released/">&lt;p&gt;&lt;code&gt;greek-accentuation&lt;/code&gt; has finally hit 1.0.0 with a couple more functions and a module layout change.&lt;/p&gt;
&lt;p&gt;The library (which I&#39;ve previously written about &lt;a href=&#34;/2015/11/20/greek-accentuation-library/&#34;&gt;here&lt;/a&gt;) has been sitting on 0.9.9 for a while and I&#39;ve been using it successfully in my inflectional morphology work for 18 months. There were, however, a couple of functions that lived in the inflectional morphology repos that really belonged in &lt;code&gt;greek-accentuation&lt;/code&gt;. They have now been moved there.&lt;/p&gt;
&lt;p&gt;There is &lt;code&gt;syllabify.debreath&lt;/code&gt; which removes smooth breathing and replaces rough breathing with an &lt;code&gt;h&lt;/code&gt;. And there is &lt;code&gt;syllabify.rebreath&lt;/code&gt; which reverses this.&lt;/p&gt;
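&lt;p&gt;As a rough illustration of what debreathing does, here is a minimal standalone sketch. It is not the library&#39;s implementation: the name &lt;code&gt;debreath_sketch&lt;/code&gt;, the use of &lt;code&gt;unicodedata&lt;/code&gt; and the choice to put the &lt;code&gt;h&lt;/code&gt; at the very start of the word are all assumptions made for the example.&lt;/p&gt;

```python
import unicodedata

SMOOTH = "\u0313"  # combining comma above (smooth breathing)
ROUGH = "\u0314"   # combining reversed comma above (rough breathing)


def debreath_sketch(word):
    """Drop smooth breathing; replace rough breathing with a leading "h".

    Hypothetical helper for illustration only, not the library function.
    """
    # Decompose so breathings become separate combining characters.
    decomposed = unicodedata.normalize("NFD", word)
    has_rough = ROUGH in decomposed
    stripped = decomposed.replace(SMOOTH, "").replace(ROUGH, "")
    # Recompose so any remaining accents are precomposed again.
    result = unicodedata.normalize("NFC", stripped)
    return "h" + result if has_rough else result
```

&lt;p&gt;Under these assumptions, a word with smooth breathing simply loses it, a word with rough breathing gains an initial &lt;code&gt;h&lt;/code&gt;, and accents are left alone.&lt;/p&gt;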
&lt;p&gt;The other big change is that there are no longer three top-level modules; everything is enclosed in a &lt;code&gt;greek_accentuation&lt;/code&gt; package so instead of &lt;code&gt;from syllabify import *&lt;/code&gt; you say &lt;code&gt;from greek_accentuation.syllabify import *&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;You can &lt;code&gt;pip install greek-accentuation==1.0.0&lt;/code&gt;. The repo is at &lt;a href=&#34;https://github.com/jtauber/greek-accentuation&#34;&gt;https://github.com/jtauber/greek-accentuation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;greek-accentuation&lt;/code&gt; is made available under an MIT license.&lt;/p&gt;
&lt;p&gt;Thanks to Kyle Johnson of the wonderful &lt;a href=&#34;http://cltk.org&#34;&gt;Classical Language Toolkit&lt;/a&gt; project for encouraging me to finally do the 1.0.0 release.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">&lt;code&gt;greek-accentuation&lt;/code&gt; has finally hit 1.0.0 with a couple more functions and a module layout change.</summary>
  </entry><entry>
    <title type="html">More Parsing of the DCC Principal Parts</title>
    <link href="https://jktauber.com/2016/07/24/more-parsing-dcc-principal-parts/" rel="alternate" type="text/html" title="More Parsing of the DCC Principal Parts"/>
    <published>2016-07-24</published>
    <updated>2016-07-24</updated>
    <id>https://jktauber.com/2016/07/24/more-parsing-dcc-principal-parts</id>
    <content type="html" xml:base="https://jktauber.com/2016/07/24/more-parsing-dcc-principal-parts/">&lt;p&gt;This is part 7 of a series of blog posts about &lt;a href=&#34;/2016/06/17/modelling-stems-and-principal-part-lists/&#34;&gt;modelling stems and principal part lists&lt;/a&gt; and looks in even more detail at the format of the principal parts list in the DCC verbs.&lt;/p&gt;
&lt;p&gt;In the &lt;a href=&#34;/2016/07/16/parsing-dcc-principal-parts/&#34;&gt;previous blog post&lt;/a&gt;, I used regular expressions to match DCC principal parts.&lt;/p&gt;
&lt;p&gt;In moving from merely matching patterns to actually extracting parts correctly, I encountered further ambiguities.&lt;/p&gt;
&lt;p&gt;Recall that previously, I just did matches like&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;{grk}, {grk}, {grk}, {grk}, {grk}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;where &lt;code&gt;{grk}&lt;/code&gt; matched any Greek word.&lt;/p&gt;
&lt;p&gt;This weekend, I expanded that to patterns more like&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;{present}, {future}, {aorist}, {perfect_active}, {aorist_passive}
{present}, {future}, {perfect_active}, {perfect_middle}, {aorist_passive}
{present}, {future}, {aorist}, {perfect_middle}, {aorist_passive}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;which actually took into account the endings of the Greek words (for example &lt;code&gt;{perfect_middle}&lt;/code&gt; only matches Greek words ending in &lt;code&gt;μαι&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;Note that the one pattern from the previous blog post becomes three patterns. These more precise patterns, however, enable easier extraction of the actual parts with their morphosyntactic properties.&lt;/p&gt;
&lt;p&gt;They also reveal some more inconsistencies. For example, 2nd aorists are not, it turns out, always explicitly marked.&lt;/p&gt;
&lt;p&gt;Also, the four-part pattern&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;{grk}, {grk}, {grk}, {grk}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;actually could be any of&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;{present}, {future}, {aorist}, {perfect_active}
{present}, {future}, {aorist}, {perfect_middle}
{present}, {future}, {aorist}, {aorist_passive}
{present}, {future}, {perfect_middle}, {aorist_passive}
{present}, {future}, {aorist_passive}, {perfect_middle}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The last pattern is necessitated by&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;δύναμαι, δυνήσομαι, ἐδυνήθην, δεδύνημαι
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;which is, presumably, an error with &lt;code&gt;ἐδυνήθην&lt;/code&gt; and &lt;code&gt;δεδύνημαι&lt;/code&gt; transposed.&lt;/p&gt;
&lt;p&gt;Besides errors like this, there is at least one ambiguity where the endings aren&#39;t enough to disambiguate.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;χαίρω, χαιρήσω, κεχάρηκα, κεχάρημαι, ἐχάρην
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;is ambiguous because &lt;code&gt;κα&lt;/code&gt; is a possible aorist ending. The ambiguity can obviously be resolved by looking at the entire form, but given some parts are annotated elsewhere to avoid possible misreading, it might be better to write the above as&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;χαίρω, χαιρήσω, pf. κεχάρηκα, κεχάρημαι, ἐχάρην
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;to make perfectly clear the aorist form has been skipped over.&lt;/p&gt;
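&lt;p&gt;The ending-sensitive patterns and the χαίρω ambiguity can be sketched roughly as follows. This is not my actual code: only the &lt;code&gt;μαι&lt;/code&gt; rule is stated above, and the other endings, along with the names &lt;code&gt;word&lt;/code&gt;, &lt;code&gt;PARTS&lt;/code&gt;, &lt;code&gt;TEMPLATES&lt;/code&gt; and &lt;code&gt;readings&lt;/code&gt;, are illustrative assumptions.&lt;/p&gt;

```python
import re
import unicodedata

# One Greek letter, using the Greek and Greek Extended Unicode ranges.
LETTER = r"[\u0370-\u03FF\u1F00-\u1FFF]"


def word(ending=""):
    """Capture group for a Greek word (optional leading hyphen) with a required ending."""
    return "(-?" + LETTER + "+" + ending + ")"


# Only the μαι rule comes from the post; the other endings are guesses.
PARTS = {
    "present": word(),
    "future": word(),
    "aorist": word("(?:α|ον)"),
    "perfect_active": word("α"),
    "perfect_middle": word("μαι"),
    "aorist_passive": word("ην"),
}

# The three five-part patterns listed earlier in the post.
TEMPLATES = [
    "{present}, {future}, {aorist}, {perfect_active}, {aorist_passive}",
    "{present}, {future}, {perfect_active}, {perfect_middle}, {aorist_passive}",
    "{present}, {future}, {aorist}, {perfect_middle}, {aorist_passive}",
]


def readings(entry):
    """Return one {part name: form} dict for every template the entry matches."""
    entry = unicodedata.normalize("NFC", entry)
    found = []
    for template in TEMPLATES:
        names = re.findall(r"\{(\w+)\}", template)
        match = re.fullmatch(template.format(**PARTS), entry)
        if match:
            found.append(dict(zip(names, match.groups())))
    return found
```

&lt;p&gt;With these assumed endings, the unannotated χαίρω entry matches two templates, one reading κεχάρηκα as the perfect active and one reading it as the aorist, which is exactly the ambiguity an explicit &lt;code&gt;pf.&lt;/code&gt; would remove.&lt;/p&gt;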
&lt;p&gt;Again, my point is not to nitpick the DCC principal parts list, but rather make explicit the assumptions that principal parts in this format make.&lt;/p&gt;
&lt;p&gt;In determining what part a particular form is, the following needs to be considered:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;explicit annotation (e.g. &lt;code&gt;pf.&lt;/code&gt; for perfects)&lt;/li&gt;
&lt;li&gt;ending (&lt;code&gt;μαι&lt;/code&gt; ending a form other than the first two parts indicates the perfect middle)&lt;/li&gt;
&lt;li&gt;position in the list (both absolutely and relative to other forms whose part is worked out from other considerations)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And the main upshot of all this is I&#39;ve now converted the DCC principal parts to a YAML format that I&#39;ll shortly merge in with the parts from Pratt and Morwood.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">This is part 7 of a series of blog posts about &lt;a href=&#34;/2016/06/17/modelling-stems-and-principal-part-lists/&#34;&gt;modelling stems and principal part lists&lt;/a&gt; and looks in even more detail at the format of the principal parts list in the DCC verbs.</summary>
  </entry><entry>
    <title type="html">Parsing the DCC Principal Parts</title>
    <link href="https://jktauber.com/2016/07/16/parsing-dcc-principal-parts/" rel="alternate" type="text/html" title="Parsing the DCC Principal Parts"/>
    <published>2016-07-16</published>
    <updated>2016-07-16</updated>
    <id>https://jktauber.com/2016/07/16/parsing-dcc-principal-parts</id>
    <content type="html" xml:base="https://jktauber.com/2016/07/16/parsing-dcc-principal-parts/">&lt;p&gt;This is part 6 of a series of blog posts about &lt;a href=&#34;/2016/06/17/modelling-stems-and-principal-part-lists/&#34;&gt;modelling stems and principal part lists&lt;/a&gt; and looks more precisely at the format of the principal parts list in the DCC verbs.&lt;/p&gt;
&lt;p&gt;We&#39;ve already discussed that the DCC principal parts are presented slightly differently than the Pratt or Morwood inasmuch as the latter two are in tabular form whereas the DCC list just has a string of comma-separated parts.&lt;/p&gt;
&lt;p&gt;In &lt;a href=&#34;/2016/06/26/formatting-principal-parts/&#34;&gt;Formatting of Principal Parts&lt;/a&gt; we touched on many of the properties of the DCC format but in the spirit of precise modeling, what I&#39;ve done below is actually write a set of regular expressions that match and enable parsing of every entry in the DCC list.&lt;/p&gt;
&lt;p&gt;In the regex patterns below, I&#39;ve used &lt;code&gt;{grk}&lt;/code&gt; for Greek words, optionally preceded by a hyphen. In my code this expands to the regex &lt;code&gt;(-?[\u0370-\u03FF\u1F00-\u1FFF]+)&lt;/code&gt;. I also have &lt;code&gt;{grk2}&lt;/code&gt; which just allows an optional second Greek word separated with &#34;or&#34; or &#34;and&#34;. &lt;code&gt;{grk2}&lt;/code&gt; hence expands to &lt;code&gt;({grk}( (or|and) {grk})?)&lt;/code&gt;. And finally, in a couple of examples, I have &lt;code&gt;{gloss}&lt;/code&gt; for glosses consisting of English words including a comma. This expands to &lt;code&gt;([a-z, ]+)&lt;/code&gt;.&lt;/p&gt;
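&lt;p&gt;A minimal sketch of how such placeholders could be expanded with &lt;code&gt;str.format&lt;/code&gt; follows. The three expansions are the ones quoted above; the helper names &lt;code&gt;compile_template&lt;/code&gt; and &lt;code&gt;matches&lt;/code&gt; and the NFC normalisation step are assumptions for the example rather than a description of my actual code.&lt;/p&gt;

```python
import re
import unicodedata

# Expansions quoted in the post.
GRK = r"(-?[\u0370-\u03FF\u1F00-\u1FFF]+)"
GRK2 = "(" + GRK + "( (or|and) " + GRK + ")?)"
GLOSS = r"([a-z, ]+)"


def compile_template(template):
    """Expand {grk}, {grk2} and {gloss} placeholders into a compiled regex."""
    return re.compile(template.format(grk=GRK, grk2=GRK2, gloss=GLOSS))


def matches(template, entry):
    """True if the whole entry matches the template."""
    # Normalise first so decomposed accents do not fall outside the ranges.
    entry = unicodedata.normalize("NFC", entry)
    return compile_template(template).fullmatch(entry) is not None


# Two entries discussed later in the post, each with its template.
print(matches(r"{grk}, {grk}, pf\. {grk}", "ἥκω, ἥξω, pf. ἧκα"))
print(matches(r"{grk2}, {grk}, impf\. {grk}, aor\. {grk}",
              "οἴομαι or οἶμαι, οἰήσομαι, impf. ᾤμην, aor. ᾠήθην"))
```

&lt;p&gt;Writing the templates as raw strings keeps the escaped periods in labels such as &lt;code&gt;pf\.&lt;/code&gt; intact when they are passed through &lt;code&gt;format&lt;/code&gt;.&lt;/p&gt;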
&lt;p&gt;The simplest cases just have a comma-separated list of Greek words. There may be only one to five parts rather than the full six, although in these cases the only gaps are in the final parts.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;quot;{grk}, {grk2}, {grk}, {grk}, {grk}, {grk2}&amp;quot;
&amp;quot;{grk}, {grk}, {grk}, {grk}, {grk}&amp;quot;
&amp;quot;{grk}, {grk}, {grk}, {grk}&amp;quot;
&amp;quot;{grk}, {grk}, {grk}&amp;quot;
&amp;quot;{grk}, {grk}&amp;quot;
&amp;quot;{grk}&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;As mentioned in the previous blog posts, when the third part is a 2nd aorist, that&#39;s made explicit. Again, sometimes the 5th and 6th, or 4th, 5th and 6th parts are omitted.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;quot;{grk}, {grk}, 2 aor\. {grk2}, {grk2}, {grk}, {grk}&amp;quot;
&amp;quot;{grk}, {grk}, 2 aor\. {grk}, {grk}&amp;quot;
&amp;quot;{grk}, {grk}, 2 aor\. {grk}&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;One pattern skips the second part but this is clear because of the explicit labeling of the third part.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;quot;{grk}, 2 aor\. {grk}, {grk}, {grk}&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;However, in one case, &#34;ἔρχομαι, fut. εἶμι or ἐλεύσομαι, 2 aor. ἦλθον, ἐλήλυθα&#34;, the second part is explicitly labeled &lt;code&gt;fut.&lt;/code&gt; because it is suppletive, even though it is unambiguously the second part by position.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;quot;{grk}, fut\. {grk2}, 2 aor\. {grk}, {grk}&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Sometimes both a 1st and 2nd aorist are given as separate parts. In the sigmatic case, &#34;ἁμαρτάνω, ἁμαρτήσομαι, ἡμάρτησα, 2 aor. ἥμαρτον, ἡμάρτηκα, ἡμάρτημαι, ἡμαρτήθην&#34;, the 1st aorist is not explicitly labeled and so the 2nd aorist is actually in the fourth position, the fourth part in the fifth position and so on.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;quot;{grk}, {grk}, {grk}, 2 aor\. {grk}, {grk}, {grk}, {grk}&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;However, sometimes the 1st aorist in this case is labeled because it is not sigmatic and so at a glance could be confused for a perfect.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;quot;{grk}, {grk}, 1 aor\. {grk}, 2 aor\. {grk}, {grk}, {grk}, {grk}&amp;quot;
&amp;quot;{grk}, {grk}, 1 aor\. {grk}, 2 aor\. {grk}, {grk}, {grk}&amp;quot;
&amp;quot;{grk}, {grk}, 1 aor\. {grk}&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;One example of this (matching the first line above) is &#34;φέρω, οἴσω, 1 aor. ἤνεγκα, 2 aor. ἤνεγκον, ἐνήνοχα, ἐνήνεγμαι, ἠνέχθην&#34;.&lt;/p&gt;
&lt;p&gt;In one case, &#34;μιμνήσκω, -μνήσω, -έμνησα, pf. μέμνημαι, ἐμνήσθην&#34;, the fourth part is skipped and the fifth is labeled &lt;code&gt;pf.&lt;/code&gt;. It would probably be clearer if this were labeled &lt;code&gt;pf. mid.&lt;/code&gt; or similar.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;quot;{grk}, {grk}, {grk}, pf\. {grk}, {grk}&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In another case, &#34;ἥκω, ἥξω, pf. ἧκα&#34;, the perfect active is labeled explicitly because there&#39;s no third part and the kappa in the imperfective stem makes the perfect form perhaps harder to identify.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;quot;{grk}, {grk}, pf\. {grk}&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Sometimes an explicit imperfect is given. This is usually at the end, after the usual parts are given.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;quot;{grk}, {grk}, impf\. {grk}&amp;quot;
&amp;quot;{grk}, {grk}, {grk}, {grk}, impf\. {grk}&amp;quot;
&amp;quot;{grk}, {grk2}, 2 aor\. {grk}, {grk}, impf\. {grk}&amp;quot;
&amp;quot;{grk}, {grk}, 2 aor\. {grk}, {grk2}, {grk}, impf\. {grk}&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In one case, &#34;οἴομαι or οἶμαι, οἰήσομαι, impf. ᾤμην, aor. ᾠήθην&#34;, (perhaps inconsistently) the imperfect is given before the aorist.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;quot;{grk2}, {grk}, impf\. {grk}, aor\. {grk}&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In one case, &#34;ἀκούω, ἀκούσομαι, ἤκουσα, ἀκήκοα, plup. ἠκηκόη or ἀκηκόη, ἠκούσθην&#34;, where there is no fifth part, two forms of the pluperfect are given instead.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;quot;{grk}, {grk}, {grk}, {grk}, plup\. {grk2}, {grk}&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In another case, however, &#34;καθίστημι, καταστήσω, κατέστησα, κατέστην, καθέστηκα, plupf. καθειστήκη, κατεστάθην&#34;, this turns out to be a little tricky because it has both a 1st and root aorist but that fact is not made explicit.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;quot;{grk}, {grk}, {grk}, {grk}, {grk}, plupf\. {grk}, {grk}&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Also note the inconsistent use of &#34;plup.&#34; vs &#34;plupf.&#34;.&lt;/p&gt;
&lt;p&gt;There are four cases that just provide various non-standard parts such as imperfects, infinitives or participles (in one case three participle parts).&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;quot;{grk}, impf\. {grk2}, infin\. {grk}&amp;quot;
&amp;quot;{grk}, {grk}, impf\. {grk}, infin\. {grk}&amp;quot;
&amp;quot;{grk}, ptc\. {grk}&amp;quot;
&amp;quot;{grk}, infin\. {grk}, ptc\. {grk}, {grk}, {grk}&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In the case of εἶδον, the first part actually &lt;em&gt;is&lt;/em&gt; the suppletive 2nd aorist of another part.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;quot;{grk}, 2 aor\. of {grk}, act\. infin\. {grk}, mid\.infin\. {grk}&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;For our purposes this may end up getting treated differently.&lt;/p&gt;
&lt;p&gt;There are five other cases where there is additional annotation:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;quot;{grk}, infin\. {grk}, imper\. {grk}, plupf\. used as impf\. {grk}&amp;quot;
&amp;quot;{grk}, {grk}, {grk}, {grk}, {grk} \(but usu\. {grk} instead\), {grk}&amp;quot;
&amp;quot;{grk}, {grk}, {grk}, {grk}, {grk} \(but commonly {grk} instead\), {grk}&amp;quot;
&amp;quot;{grk} \(usually mid\. {grk}\), {grk}, {grk}, {grk}, {grk}&amp;quot;
&amp;quot;{grk}, {grk}, {grk}, 2 aor\. mid\. {grk}, pf\. {grk} \(“I have utterly destroyed”\) or {grk} \(“I am undone”\)&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And finally there are five cases that are clearly typos where the crucial comma delimiter has been omitted or accidentally replaced with a period.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;quot;{grk} {grk} {gloss}, {grk} {gloss}, 2 aor\. {grk} {gloss}, {grk} {gloss}, plup\. {grk} {gloss}, {grk} {gloss}&amp;quot;
&amp;quot;{grk}, {grk}, {grk}, {grk}\. {grk}, {grk}&amp;quot;
&amp;quot;{grk} {grk}&amp;quot;
&amp;quot;{grk} {grk}, 2 aor\. {grk}&amp;quot;
&amp;quot;{grk} {grk}, {grk}, {grk}, {grk}, {grk}&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;These correspond to:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;ἵστημι στήσω will set, ἔστησα set, caused to stand, 2 aor. ἔστην stood, ἕστηκα stand, plup. εἱστήκη stood, ἐστάθην stood
τυγχάνω, τεύξομαι, ἔτυχον, τετύχηκα. τέτυγμαι, ἐτύχθην
προσήκω προσήξω
ἕπομαι ἕψομαι, 2 aor. ἑσπόμην
βουλεύω βουλεύσω, ἐβούλευσα, βεβούλευκα, βεβούλευμαι, ἐβουλεύθην
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;These cases should probably just be fixed upstream.&lt;/p&gt;
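&lt;p&gt;Typos like these can also be flagged mechanically by checking which entries match none of the clean patterns. The following is a minimal sketch under stated assumptions: it only knows the unlabeled comma-separated patterns, so entries with labels such as &lt;code&gt;2 aor.&lt;/code&gt; would need the labeled patterns added before this check is useful for them, and the names &lt;code&gt;CLEAN&lt;/code&gt; and &lt;code&gt;unmatched&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
import re
import unicodedata

GRK = r"(-?[\u0370-\u03FF\u1F00-\u1FFF]+)"

# The unlabeled comma-separated patterns only, from six parts down to one.
CLEAN = [re.compile(", ".join([GRK] * n)) for n in range(6, 0, -1)]


def unmatched(entries):
    """Return the entries that match none of the clean patterns."""
    flagged = []
    for entry in entries:
        text = unicodedata.normalize("NFC", entry)
        if not any(pattern.fullmatch(text) for pattern in CLEAN):
            flagged.append(entry)
    return flagged


entries = [
    "τυγχάνω, τεύξομαι, ἔτυχον, τετύχηκα. τέτυγμαι, ἐτύχθην",  # period for comma
    "προσήκω προσήξω",  # missing comma
]
for entry in unmatched(entries):
    print("check:", entry)
```

&lt;p&gt;Both sample entries are reported, while the τυγχάνω entry with its stray period replaced by a comma would pass as an ordinary six-part list.&lt;/p&gt;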
&lt;p&gt;Now, admittedly, it probably would have been quicker for me to just manually convert the 149 strings into some completely unambiguous format rather than write regular expressions that match them all, handling typos and idiosyncrasies. But the approach highlights both specific issues with the DCC list (which admittedly are quite minor; I don&#39;t want to detract from the wonderful resource the DCC Core List is) and the value of precise modeling like this in identifying inconsistencies and potential ambiguities in the way this sort of information is presented.&lt;/p&gt;
&lt;p&gt;While it&#39;s outside the scope of this blog series, I&#39;ve been exploring for a while similar tests on entire lexicon entries. This pretty quickly exposes inconsistencies. Even in cases where a markup language such as XML is used, unless it&#39;s very fine-grained markup (like the Cambridge Lexicon is/was using) lots of inconsistencies and ambiguities can creep in.&lt;/p&gt;
&lt;p&gt;All of this comes back to what I talked about in my 2015 SBL and BibleTech talks under the heading of &lt;a href=&#34;/2015/11/11/technical-aspects-openness/&#34;&gt;Technical Aspects of Openness&lt;/a&gt; and what&#39;s involved in making linguistic data truly machine-actionable.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">This is part 6 of a series of blog posts about &lt;a href=&#34;/2016/06/17/modelling-stems-and-principal-part-lists/&#34;&gt;modelling stems and principal part lists&lt;/a&gt; and looks more precisely at the format of the principal parts list in the DCC verbs.</summary>
  </entry><entry>
    <title type="html">Formatting of Principal Parts</title>
    <link href="https://jktauber.com/2016/06/26/formatting-principal-parts/" rel="alternate" type="text/html" title="Formatting of Principal Parts"/>
    <published>2016-06-26</published>
    <updated>2016-06-26</updated>
    <id>https://jktauber.com/2016/06/26/formatting-principal-parts</id>
    <content type="html" xml:base="https://jktauber.com/2016/06/26/formatting-principal-parts/">&lt;p&gt;This is part 5 of a series of blog posts about &lt;a href=&#34;/2016/06/17/modelling-stems-and-principal-part-lists/&#34;&gt;modelling stems and principal part lists&lt;/a&gt; and covers the format of the principal parts themselves in the Pratt, Morwood and DCC verb lists.&lt;/p&gt;
&lt;p&gt;Now that we&#39;ve looked at how the various lemmas interrelate, let&#39;s turn our attention to the individual part formatting. Here I just describe the various idiosyncrasies. In subsequent posts, I&#39;ll discuss how to bring together (the relevant parts of) this information in a single, machine-actionable format.&lt;/p&gt;
&lt;h2&gt;Pratt&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;unattested form cells have emdash &lt;code&gt;—&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;forms only found with a prefix but listed under the base verb are prefixed &lt;code&gt;-&lt;/code&gt; (often still with breathing but sometimes inconsistently not)&lt;/li&gt;
&lt;li&gt;alternative forms separated by &lt;code&gt;/&lt;/code&gt;&lt;ul&gt;
&lt;li&gt;active vs middle (this will be an important distinction in later posts)&lt;/li&gt;
&lt;li&gt;different augment handling&lt;/li&gt;
&lt;li&gt;stem alternatives&lt;/li&gt;
&lt;li&gt;other spelling differences&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;some single-letter spelling differences are just indicated with parenthetical letter (could be expanded to just use &lt;code&gt;/&lt;/code&gt; as above)&lt;/li&gt;
&lt;li&gt;aorists sometimes indicate the root in parentheses where it might not be predictable from the part (particularly useful later for inferring unaugmented stems, etc)&lt;/li&gt;
&lt;li&gt;(rarely) section number with paradigm is referenced&lt;/li&gt;
&lt;li&gt;(rarely) part-specific gloss is included&lt;/li&gt;
&lt;li&gt;forms taken from another synonymous verb indicated by &lt;code&gt;*&lt;/code&gt; (although not all suppletion indicated this way)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Morwood&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;includes seventh part for future passive&lt;/li&gt;
&lt;li&gt;vowel lengths indicated&lt;/li&gt;
&lt;li&gt;pre-contracted forms (especially in future) are shown in parentheses&lt;/li&gt;
&lt;li&gt;rare forms are in &lt;em&gt;italics&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;forms only found with a prefix but listed under the base verb are prefixed &lt;code&gt;-&lt;/code&gt; (not normally with breathing but one or two inconsistencies)&lt;/li&gt;
&lt;li&gt;alternative forms separated by &lt;code&gt;,&lt;/code&gt; (or on new line, see below)&lt;/li&gt;
&lt;li&gt;imperfect form sometimes listed under aorist column (marked &lt;code&gt;impf.&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;specifically transitive or intransitive forms sometimes marked &lt;code&gt;(tr.)&lt;/code&gt; or &lt;code&gt;(intr.)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;because alternative lemmas get their own line, corresponding forms can be lined up&lt;/li&gt;
&lt;li&gt;(rarely) page number references&lt;/li&gt;
&lt;li&gt;(rarely) part-specific glosses&lt;/li&gt;
&lt;li&gt;poetic spelling variants sometimes indicated&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;DCC Greek Core List&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;unlike Pratt and Morwood, the parts are just a comma-separated list&lt;/li&gt;
&lt;li&gt;missing forms are not indicated as such so sometimes fewer than six forms are listed; if there are gaps, the next form is sometimes annotated with which part it is (and sometimes it’s annotated even when it doesn’t need to be)&lt;/li&gt;
&lt;li&gt;second aorists are annotated with &lt;code&gt;2 aor.&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;where there is a first and second aorist, they can both be given as separate, comma-separated parts (with first annotated as &lt;code&gt;1 aor.&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;non-standard parts are sometimes given (e.g. &lt;code&gt;impf.&lt;/code&gt;, &lt;code&gt;infin.&lt;/code&gt;, &lt;code&gt;ptc.&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;forms only found with a prefix but listed under the base verb are prefixed &lt;code&gt;-&lt;/code&gt; (not normally with breathing)&lt;/li&gt;
&lt;li&gt;occasionally further annotated in parentheses, e.g.:&lt;ul&gt;
&lt;li&gt;&lt;code&gt;πειράω (usually mid. πειράομαι)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;προστέθειμαι (but commonly προσκεῖμαι instead)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;τέθειμαι (but usu. κεῖμαι instead)&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;in a couple of cases forms are glossed (although inconsistently presented):&lt;ul&gt;
&lt;li&gt;&lt;code&gt;pf. ἀπολώλεκα (“I have utterly destroyed”) or ἀπόλωλα (“I am undone”)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ἵστημι στήσω will set, ἔστησα set, caused to stand, 2 aor. ἔστην stood, ἕστηκα stand, plup. εἱστήκη stood, ἐστάθην stood&lt;/code&gt; (note missing comma between first two parts)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;alternative forms just listed separated by &lt;code&gt;or&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">This is part 5 of a series of blog posts about &lt;a href=&#34;/2016/06/17/modelling-stems-and-principal-part-lists/&#34;&gt;modelling stems and principal part lists&lt;/a&gt; and covers the format of the principal parts themselves in the Pratt, Morwood and DCC verb lists.</summary>
  </entry><entry>
    <title type="html">Merging the DCC Lemmas</title>
    <link href="https://jktauber.com/2016/06/22/merging-dcc-lemmas/" rel="alternate" type="text/html" title="Merging the DCC Lemmas"/>
    <published>2016-06-22</published>
    <updated>2016-06-22</updated>
    <id>https://jktauber.com/2016/06/22/merging-dcc-lemmas</id>
    <content type="html" xml:base="https://jktauber.com/2016/06/22/merging-dcc-lemmas/">&lt;p&gt;This is part 4 of a series of blog posts about &lt;a href=&#34;/2016/06/17/modelling-stems-and-principal-part-lists/&#34;&gt;modelling stems and principal part lists&lt;/a&gt; and covers the Dickinson College Commentaries (DCC) Greek Core lemmas and issues in merging them with the existing merge of Pratt and Morwood.&lt;/p&gt;
&lt;p&gt;It was relatively straightforward to merge in the lemmas from the DCC.&lt;/p&gt;
&lt;p&gt;Of the 149 verb entries in the DCC, 111 of them matched exactly with an existing Pratt or Morwood lemma (dropping length in the latter as the DCC doesn&#39;t include it).&lt;/p&gt;
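&lt;p&gt;The exact-match step can be sketched like this. It assumes vowel length is marked with combining macrons and breves, which may not be how the actual data encodes it, and the names &lt;code&gt;strip_length&lt;/code&gt; and &lt;code&gt;exact_matches&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
import unicodedata

MACRON = "\u0304"  # combining macron (long vowel)
BREVE = "\u0306"   # combining breve (short vowel)


def strip_length(lemma):
    """Remove macrons and breves so lemmas compare equal regardless of length marking."""
    decomposed = unicodedata.normalize("NFD", lemma)
    stripped = decomposed.replace(MACRON, "").replace(BREVE, "")
    return unicodedata.normalize("NFC", stripped)


def exact_matches(dcc_lemmas, other_lemmas):
    """DCC lemmas that equal some other lemma once length marks are dropped."""
    known = {strip_length(lemma) for lemma in other_lemmas}
    return [lemma for lemma in dcc_lemmas if strip_length(lemma) in known]
```

&lt;p&gt;Normalising both sides the same way keeps the comparison symmetric, and whatever is left over after &lt;code&gt;exact_matches&lt;/code&gt; is what needs categorising by hand.&lt;/p&gt;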
&lt;p&gt;The remaining 38 cases were simple and fell into one of nine categories:&lt;/p&gt;
&lt;h2&gt;1. multiple spellings [3]&lt;/h2&gt;
&lt;p&gt;There is only one case of multiple spellings in the DCC verbs (the first one below). In the other two cases, DCC only gives one of the spellings given by Pratt or Morwood.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;οἴομαι/οἶμαι is given as &#34;οἴομαι or οἶμαι&#34;&lt;/li&gt;
&lt;li&gt;only σκοπέω of σκέπτομαι/σκοπέω is given&lt;/li&gt;
&lt;li&gt;only μίγνυμι of μείγνυμι/μίγνυμι is given&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;2. difference in voice [4]&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;ἀποκρίνω is given, not ἀποκρίνομαι&lt;/li&gt;
&lt;li&gt;πορεύω is given, not πορεύομαι&lt;/li&gt;
&lt;li&gt;φοβέω is given, not φοβέομαι&lt;/li&gt;
&lt;li&gt;πειράω is given as &#34;πειράω (usually mid. πειράομαι)&#34;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;3. compounds of existing base (also in DCC) [17]&lt;/h2&gt;
&lt;p&gt;DCC focuses more on useful vocabulary than on useful principal parts in its choice of which verbs to include. In this sense it&#39;s the opposite of Morwood. As a result, it includes compounds where Pratt or Morwood would only include the base. In all the cases below, the DCC also includes the base (and those bases are all among the 111 lemmas matching exactly).&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ἀναιρέω, ἀφαιρέω&lt;/li&gt;
&lt;li&gt;ὑπάρχω&lt;/li&gt;
&lt;li&gt;συμβαίνω&lt;/li&gt;
&lt;li&gt;ἀποδίδωμι, παραδίδωμι&lt;/li&gt;
&lt;li&gt;πάρειμι&lt;/li&gt;
&lt;li&gt;προσήκω&lt;/li&gt;
&lt;li&gt;παρέχω&lt;/li&gt;
&lt;li&gt;ἀποθνῄσκω&lt;/li&gt;
&lt;li&gt;ἀφίημι&lt;/li&gt;
&lt;li&gt;καθίστημι, προστίθημι&lt;/li&gt;
&lt;li&gt;καταλαμβάνω, ὑπολαμβάνω&lt;/li&gt;
&lt;li&gt;διαφέρω, συμφέρω&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;ἀποδίδωμι here is somewhat debatable as we already have ἀποδίδομαι in Pratt and Morwood but only under πωλέω.&lt;/p&gt;
&lt;h2&gt;4. compounds of existing base (not in DCC) [2]&lt;/h2&gt;
&lt;p&gt;In two cases, DCC has a compound whose base is already in Pratt or Morwood but not in DCC itself.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ἀποκτείνω&lt;/li&gt;
&lt;li&gt;ἀπαλλάσσω&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;5. compounds where other compound but not base existed [1]&lt;/h2&gt;
&lt;p&gt;In one case, DCC has a compound whose base is not in Pratt, Morwood or DCC but another compound of the same base is.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;κατασκευάζω (no σκευάζω but παρασκευάζω existed)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;6. compounds with no base existing [1]&lt;/h2&gt;
&lt;p&gt;And in one case, DCC has a compound where neither the base nor any other compound of that base is in Pratt, Morwood or DCC.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;κατηγορέω&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;7. σσ vs ττ [3]&lt;/h2&gt;
&lt;p&gt;DCC favours σσ over ττ (whereas Pratt and Morwood use the latter, although Morwood does have ἀλλάσσω alongside ἀλλάττω).&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;πράσσω&lt;/li&gt;
&lt;li&gt;τάσσω&lt;/li&gt;
&lt;li&gt;φυλάσσω&lt;/li&gt;
&lt;/ul&gt;
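&lt;p&gt;For matching purposes, the σσ/ττ difference could be neutralised before lemmas are compared. The following is only a sketch: the function name &lt;code&gt;normalise_sigma_tau&lt;/code&gt; is hypothetical, the ττ spellings are assumed to be the Pratt and Morwood counterparts of the three DCC lemmas, and a blanket replacement would need checking against words where ττ is not simply a variant of σσ.&lt;/p&gt;

```python
def normalise_sigma_tau(lemma):
    """Map ττ to σσ so that the two spellings compare equal (assumed rule)."""
    return lemma.replace("ττ", "σσ")


# Assumed ττ spellings alongside the DCC σσ spellings listed above.
pairs = [("πράττω", "πράσσω"), ("τάττω", "τάσσω"), ("φυλάττω", "φυλάσσω")]

for tt_form, ss_form in pairs:
    same = normalise_sigma_tau(tt_form) == normalise_sigma_tau(ss_form)
    print(tt_form, ss_form, same)
```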
&lt;h2&gt;8. words appearing under different entry due to suppletion [3]&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;δέδοικα (Pratt has under δείδω)&lt;/li&gt;
&lt;li&gt;εἶδον (Pratt and Morwood have under ὁράω)&lt;/li&gt;
&lt;li&gt;εἶμι (Pratt and Morwood have under ἔρχομαι)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;9. completely new words [4]&lt;/h2&gt;
&lt;p&gt;These are unique to DCC.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ἔρομαι&lt;/li&gt;
&lt;li&gt;λαλέω&lt;/li&gt;
&lt;li&gt;πολεμέω&lt;/li&gt;
&lt;li&gt;ἔοικα&lt;/li&gt;
&lt;/ul&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">This is part 4 of a series of blog posts about &lt;a href=&#34;/2016/06/17/modelling-stems-and-principal-part-lists/&#34;&gt;modelling stems and principal part lists&lt;/a&gt; and covers the Dickinson College Commentaries (DCC) Greek Core lemmas and issues in merging them with the existing merge of Pratt and Morwood.</summary>
  </entry><entry>
    <title type="html">Merging the Morwood and Pratt Lemmas</title>
    <link href="https://jktauber.com/2016/06/21/merging-morwood-and-pratt-lemmas/" rel="alternate" type="text/html" title="Merging the Morwood and Pratt Lemmas"/>
    <published>2016-06-21</published>
    <updated>2016-06-21</updated>
    <id>https://jktauber.com/2016/06/21/merging-morwood-and-pratt-lemmas</id>
    <content type="html" xml:base="https://jktauber.com/2016/06/21/merging-morwood-and-pratt-lemmas/">&lt;p&gt;This is part 3 of a series of blog posts about &lt;a href=&#34;/2016/06/17/modelling-stems-and-principal-part-lists/&#34;&gt;modelling stems and principal part lists&lt;/a&gt; and covers the Morwood lemmas and issues in merging them with Pratt&#39;s.&lt;/p&gt;
&lt;p&gt;Like Pratt, Morwood conflates the lemma with the first principal part and similarly calls the relevant column “present”.&lt;/p&gt;
&lt;p&gt;One of the first differences one notices is that Morwood’s principal parts list indicates vowel length. This is useful in many cases for the accentuation stage of my form generating code. That Morwood indicates length and Pratt doesn’t has at least two implications: (1) it means that any matching between the lists will have to strip length (not a big deal); (2) it raises the question of whether forms in Pratt but not Morwood should somehow be tagged as underspecified for length (perhaps to be later inferred from accentuation or looked up manually in other sources).&lt;/p&gt;
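&lt;p&gt;Stripping length for matching can be sketched with Python&#39;s standard &lt;code&gt;unicodedata&lt;/code&gt; module. This is a minimal illustration, not the code actually used; the function names are made up for the example and it assumes length is marked only with a combining macron or breve once the text is decomposed.&lt;/p&gt;

```python
import unicodedata

MACRON = "\u0304"  # combining macron, marks a long vowel
BREVE = "\u0306"   # combining breve, marks a short vowel


def strip_length(form):
    """Remove vowel-length marks, keeping accents and breathings."""
    decomposed = unicodedata.normalize("NFD", form)
    kept = "".join(ch for ch in decomposed if ch not in (MACRON, BREVE))
    return unicodedata.normalize("NFC", kept)


def same_ignoring_length(a, b):
    """Compare two lemmas with vowel length stripped from both."""
    return strip_length(a) == strip_length(b)


print(same_ignoring_length("μείγνῡμι", "μείγνυμι"))
```

&lt;p&gt;Normalising both sides also irons out differences between precomposed and decomposed accents, which otherwise cause spurious mismatches.&lt;/p&gt;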
&lt;p&gt;Like Pratt, Morwood indicates where a base form is used but a particular compound is more common. As we saw previously, Pratt does this by saying &lt;code&gt;αἰνέω {ἐπαινέω}&lt;/code&gt;. Morwood, in turn, says &lt;code&gt;αἰνέω (ἐπ-)&lt;/code&gt;. Each is fairly easily derivable from the other and whatever our own internal format will be, we should be able to reconstruct both the Pratt and Morwood display. However Morwood will sometimes include more than one preverb. For example &lt;code&gt;στέλλω (ἀπο-, ἐπι-)&lt;/code&gt;. In this case Pratt just gives &lt;code&gt;στέλλω&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Sometimes a single preverb will have alternative spellings (depending on assimilation) which Morwood indicates like &lt;code&gt;πίμπλημι (ἐμ-/ἐν-)&lt;/code&gt;.&lt;/p&gt;
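&lt;p&gt;Morwood&#39;s parenthesised preverbs are regular enough to unpack mechanically. The sketch below uses a hypothetical &lt;code&gt;parse_morwood&lt;/code&gt; helper and assumes commas separate distinct preverbs while slashes separate alternative spellings of one preverb; it only splits the notation and does not attempt to build the compound itself.&lt;/p&gt;

```python
def parse_morwood(entry):
    """Split a Morwood-style first column into base and preverbs.

    Returns (base, preverbs) where preverbs is a list of lists: one inner
    list per preverb, holding its alternative spellings.
    """
    base, _, rest = entry.partition("(")
    rest = rest.strip().rstrip(")")
    preverbs = []
    for item in rest.split(","):
        variants = [v.strip().rstrip("-") for v in item.split("/") if v.strip()]
        if variants:
            preverbs.append(variants)
    return base.strip(), preverbs


for entry in ["αἰνέω (ἐπ-)", "στέλλω (ἀπο-, ἐπι-)", "πίμπλημι (ἐμ-/ἐν-)", "στέλλω"]:
    print(parse_morwood(entry))
```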
&lt;p&gt;One somewhat unusual feature of Morwood is it will group synonyms such as βιόω and ζάω, or πωλέω and ἀποδίδομαι. It still puts them on separate lines, though, which enables other parts to be correlated.&lt;/p&gt;
&lt;p&gt;A similar approach is taken to spelling variations. In Morwood, these are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ἀλλάσσω and ἀλλάττω&lt;/li&gt;
&lt;li&gt;ἁρμόττω and ἁρμόζω&lt;/li&gt;
&lt;li&gt;κλαίω and κλᾱ́ω (the latter of which Morwood annotates with &lt;code&gt;(in prose)&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;αὐξάνω and αὔξω&lt;/li&gt;
&lt;li&gt;μείγνῡμι and μῑ́γνῡμι&lt;/li&gt;
&lt;li&gt;οἶμαι and οἴομαι&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;each expressed as a pair of lines.&lt;/p&gt;
&lt;p&gt;There are only two other things to note about Morwood’s first column: (1) where he groups βιόω and ζάω, the latter is inexplicably put in square brackets; (2) &lt;em&gt;italics&lt;/em&gt; is occasionally used to indicate a form that is rare or non-attested. This is more often seen in parts other than the first but it does occur in the first part in Morwood’s second list in two cases: βλώσκω and δαρθάνω (κατα).&lt;/p&gt;
&lt;h2&gt;Matching up Pratt and Morwood&lt;/h2&gt;
&lt;p&gt;There are 73 entries identical in lemma between Pratt and Morwood’s first list. There are 27 entries identical in lemma between Pratt and Morwood’s second list.&lt;/p&gt;
&lt;p&gt;There are 14 entries where Morwood simply adds vowel length but otherwise the lemmas are the same (10 in first list, 4 in second).&lt;/p&gt;
&lt;p&gt;In three cases the lemmas are in fact the same but the common compound is just formatted differently:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;αἰνέω {ἐπαινέω} vs αἰνέω (ἐπ-)&lt;/li&gt;
&lt;li&gt;θνῄσκω {ἀποθνῄσκω} vs θνῄσκω (ἀπο-)&lt;/li&gt;
&lt;li&gt;κτείνω {ἀποκτείνω} vs κτείνω (ἀπο-)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Similarly in two cases, Pratt just adds the preverb analysis:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;[ἀνα]λίσκω vs ἀνᾱλίσκω&lt;/li&gt;
&lt;li&gt;[ἀφ]ικνέομαι vs ἀφικνέομαι&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;(although note ἀνᾱλίσκω also adds vowel length)&lt;/p&gt;
&lt;p&gt;In one case, Pratt gives the common compound on the base entry but Morwood doesn&#39;t:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ἵημι {ἀφίημι} vs ῑ̔́ημι&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;(and Morwood adds vowel length)&lt;/p&gt;
&lt;p&gt;In five cases, Pratt gives a compound with preverb analysis but Morwood has base (showing common preverb):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;[ἀν]οίγνυμι/[ἀν]οίγω vs οἴγνῡμι (ἀν-)&lt;/li&gt;
&lt;li&gt;[ἀπ]όλλυμι vs ὄλλῡμι (ἀπ-)&lt;/li&gt;
&lt;li&gt;[καθ]εύδω vs εὕδω (καθ-)&lt;/li&gt;
&lt;li&gt;[κατα]δαρθάνω vs δαρθάνω (κατα)&lt;/li&gt;
&lt;li&gt;[δια]φθείρω vs φθείρω (δια-)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;(although note Pratt also has φθείρω as separate entry; Morwood adds vowel length for οἴγνῡμι (ἀν-) and ὄλλῡμι (ἀπ-); Morwood doesn’t have the alternative ἀνοίγω for ἀνοίγνῡμι)&lt;/p&gt;
&lt;p&gt;In three cases, Pratt gives the base (as does Morwood) but Morwood adds a common preverb:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;μιμνῄσκω vs μιμνῄσκω (ἀνα-)&lt;/li&gt;
&lt;li&gt;πίμπλημι vs πίμπλημι (ἐμ-/ἐν-)&lt;/li&gt;
&lt;li&gt;στέλλω vs στέλλω (ἀπο-/ἐπι-)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;φθείρω vs φθείρω (δια-) would be included here but Pratt separately has [δια]φθείρω.&lt;/p&gt;
&lt;p&gt;Also, Pratt has an unmatched [ἀπο]κρίνομαι but Pratt and Morwood have a separate κρίνω and κρῑ́νω respectively.&lt;/p&gt;
&lt;p&gt;In two cases, Pratt gives middle form but Morwood gives active form:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;μαίνομαι vs μαίνω&lt;/li&gt;
&lt;li&gt;ψεύδομαι vs ψεύδω&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And in two cases, Morwood gives an indefinite form where Pratt gives 1st singular:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;δέω (2) vs δεῖ&lt;/li&gt;
&lt;li&gt;μέλω vs μέλει&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There are 105 lemmas unique to Pratt (although this includes [δια]λέγομαι and [συλ]λέγω which could be mapped to λέγω). Most of these entries appear to be regular and so, given Morwood’s focus on irregular verbs, it is not surprising there are omissions.&lt;/p&gt;
&lt;p&gt;Morwood’s first list adds three new lemmas: ἀποδίδομαι (grouped under πωλέω with which it&#39;s suppletive in 3rd part), βιόω and χρή.&lt;/p&gt;
&lt;p&gt;Morwood’s second list adds 42 new lemmas: ἄγνῡμι, αἰδέομαι, ἀλείφω, ἅλλομαι, ἁρμόττω / ἁρμόζω, βλώσκω, ἐξετάζω, ζεύγνῡμι, ζέω, καθαίρω, καλύπτω, κείρω, κεράννῡμι, κερδαίνω, κηρῡ́ττω, κρεμάννῡμι, νέω, ὄζω, ὀνινημι, ὀρύττω, ὀσφραίνομαι, ὀφλισκάνω, παίω, περαίνω, πέρδομαι, πετάννῡμι (ἀνα-), πέτομαι, πήγνῡμι, πίμπρημι (ἐμ-/ἐν-), πνέω, σβέννῡμι, σκάπτω, σπάω, σπείρω, σπένδω, σφάλλω, τελέω, τήκω, ὑφαίνω, φείδομαι, χρῑ́ω, ὠθέω.&lt;/p&gt;
&lt;h2&gt;Concluding Thoughts&lt;/h2&gt;
&lt;p&gt;Inclusion of vowel length and differences in how common compounds are shown are easy to handle in any model merging these two lists. If bases and compounds get individual entries containing their parts but are otherwise linked via additional properties, we get around those issues too.&lt;/p&gt;
&lt;p&gt;However there remain four open issues to deal with:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;whether spelling differences that don&#39;t span all parts should get separate entries.&lt;/li&gt;
&lt;li&gt;how to handle one list giving form in active but another in middle&lt;/li&gt;
&lt;li&gt;how to handle one list giving indefinite its own entry, the other putting it under the first person singular&lt;/li&gt;
&lt;li&gt;situations where one list uses forms from one lexeme for some of the parts of another&lt;/li&gt;
&lt;/ul&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">This is part 3 of a series of blog posts about &lt;a href=&#34;/2016/06/17/modelling-stems-and-principal-part-lists/&#34;&gt;modelling stems and principal part lists&lt;/a&gt; and covers the Morwood lemmas and issues in merging them with Pratt&#39;s.</summary>
  </entry><entry>
    <title type="html">Sources of Principal Part Lists</title>
    <link href="https://jktauber.com/2016/06/18/sources-principal-part-lists/" rel="alternate" type="text/html" title="Sources of Principal Part Lists"/>
    <published>2016-06-18T17:35:09</published>
    <updated>2016-06-18T17:35:09</updated>
    <id>https://jktauber.com/2016/06/18/sources-principal-part-lists</id>
    <content type="html" xml:base="https://jktauber.com/2016/06/18/sources-principal-part-lists/">&lt;p&gt;This is part 1 of a series of blog posts about &lt;a href=&#34;/2016/06/17/modelling-stems-and-principal-part-lists/&#34;&gt;modelling stems and principal part lists&lt;/a&gt; and covers the three sources of Attic Greek principal parts used to expand and test the &lt;em&gt;Morphological Lexicon&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Because Louise Pratt’s &lt;em&gt;The Essentials of Greek Grammar&lt;/em&gt; was the basis for testing a lot of paradigms, it made sense to use it as the starting point for Attic Greek principal parts as well. Pratt lists the principal parts (the standard six, i.e. not separating out the so-called “future passive”) for 247 verbs. No reason is given for her particular choice of verbs other than their being &#34;common Attic Verbs&#34;.&lt;/p&gt;
&lt;p&gt;The second source is James Morwood’s &lt;em&gt;Oxford Grammar of Classical Greek&lt;/em&gt;. Morwood has two lists, one of &#34;Top 101 irregular verbs&#34; and one of (81) &#34;More principal parts&#34;. The title of the first list suggests common verbs are omitted if regular. I have included both lists (although can treat them separately). Morwood includes a seventh part for the “future passive” (when and why this is useful is worthy of a separate blog post).&lt;/p&gt;
&lt;p&gt;For my third source I used Chris Francese’s principal parts in the wonderful &lt;a href=&#34;http://dcc.dickinson.edu/greek-core-list&#34;&gt;DCC Greek Core Vocabulary list&lt;/a&gt;. The DCC core vocabulary consists of 500 common words of which 151 are verbs.&lt;/p&gt;
&lt;p&gt;All three lists included the occasional form outside the usual six or seven principal parts and a future post in this blog series will address the modelling of that.&lt;/p&gt;
&lt;p&gt;The DCC principal parts were in electronic form and so were relatively easy to deal with (although I’ll discuss specifics in a later post). I did not have either the Pratt or the Morwood list in electronic form and so keyed them in manually over the course of a few weeks (mostly in Vienna earlier this year).&lt;/p&gt;
&lt;p&gt;I have also referred at times to Wilfred Major’s 80% list (discussed &lt;a href=&#34;/2015/10/30/core-vocabulary-new-testament-greek/&#34;&gt;elsewhere&lt;/a&gt; on this blog) but, as it doesn’t contain principal parts, it was more of a reference for lemma choice and additional metadata than an input for testing part generation itself.&lt;/p&gt;
&lt;p&gt;Of course many other lists could be included but these three are sufficient to establish most of the modelling issues and ensure the code works correctly. Data from other lists can be incorporated later relatively easily.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">This is part 1 of a series of blog posts about &lt;a href=&#34;/2016/06/17/modelling-stems-and-principal-part-lists/&#34;&gt;modelling stems and principal part lists&lt;/a&gt; and covers the three sources of Attic Greek principal parts used to expand and test the &lt;em&gt;Morphological Lexicon&lt;/em&gt;.</summary>
  </entry><entry>
    <title type="html">Lemmas in the Pratt Principal Parts</title>
    <link href="https://jktauber.com/2016/06/18/lemmas-pratt-principal-parts/" rel="alternate" type="text/html" title="Lemmas in the Pratt Principal Parts"/>
    <published>2016-06-18T23:51:03</published>
    <updated>2016-06-18T23:51:03</updated>
    <id>https://jktauber.com/2016/06/18/lemmas-pratt-principal-parts</id>
    <content type="html" xml:base="https://jktauber.com/2016/06/18/lemmas-pratt-principal-parts/">&lt;p&gt;This is part 2 of a series of blog posts about &lt;a href=&#34;/2016/06/17/modelling-stems-and-principal-part-lists/&#34;&gt;modelling stems and principal part lists&lt;/a&gt; and covers the complexities in the notion of a lemma identifying lexical entries, specifically in the Pratt principal parts.&lt;/p&gt;
&lt;p&gt;Before we get to the other principal parts beyond the first, there is a lot to be discussed just about the first part and its use as a lemma, identifying the lexical entry to which all the parts belong. In this post, we’ll start just looking at the presentation of lemmas in the Pratt list and in the next post move on to the other sources and the problems of merging multiple lists that may differ in choice of lemma for the same lexical entry.&lt;/p&gt;
&lt;p&gt;The canonical lemma / first principal part is the present active (or middle) indicative first person singular of the verb but there are at least eight ways in which the first column in the Pratt principal parts table differs from this ideal.&lt;/p&gt;
&lt;h2&gt;1. Contract verbs&lt;/h2&gt;
&lt;p&gt;The present active indicative first person singular of a contract verb like ἀγαπάω is, of course, not ἀγαπάω but ἀγαπῶ. The pre-contract version is often used (and is indeed used by Pratt) in lemmas and the first principal part so the stem vowel is explicit (as it’s necessary for generating other forms).&lt;/p&gt;
&lt;h2&gt;2. Base Verbs With a More Common Compound&lt;/h2&gt;
&lt;p&gt;Where a base verb gets its own entry but there is a more common compound, Pratt includes the latter in braces:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;αἰνέω {ἐπαινέω}&lt;/li&gt;
&lt;li&gt;ἀπατάω {ἐξαπατάω}&lt;/li&gt;
&lt;li&gt;θνῄσκω {ἀποθνῄσκω}&lt;/li&gt;
&lt;li&gt;ἵημι {ἀφίημι}&lt;/li&gt;
&lt;li&gt;κτείνω {ἀποκτείνω}&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Note that the other parts in this case are still given just for the base verb, even if that means they are not attested in Greek texts.&lt;/p&gt;
&lt;h2&gt;3. Compound Verbs&lt;/h2&gt;
&lt;p&gt;In some cases only one compound verb gets an entry, but the preverb is indicated in square brackets:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;[ἀνα]λίσκω&lt;/li&gt;
&lt;li&gt;[ἀν]οίγνυμι/[ἀν]οίγω&lt;/li&gt;
&lt;li&gt;[ἀπ]αντάω&lt;/li&gt;
&lt;li&gt;[ἀπο]κρίνομαι&lt;/li&gt;
&lt;li&gt;[ἀπ]όλλυμι&lt;/li&gt;
&lt;li&gt;[ἀπο]λογέομαι&lt;/li&gt;
&lt;li&gt;[ἀφ]ικνέομαι&lt;/li&gt;
&lt;li&gt;[δια]λέγομαι&lt;/li&gt;
&lt;li&gt;[δια]νοέομαι&lt;/li&gt;
&lt;li&gt;[δια]φθείρω&lt;/li&gt;
&lt;li&gt;[δι]ηγέομαι&lt;/li&gt;
&lt;li&gt;[ἐκ]πλήττω&lt;/li&gt;
&lt;li&gt;[ἐπι]θυμέω&lt;/li&gt;
&lt;li&gt;[ἐπι]μελ(έ)ομαι&lt;/li&gt;
&lt;li&gt;[ἐπι]τηδεύω&lt;/li&gt;
&lt;li&gt;[ἐπι]χειρέω&lt;/li&gt;
&lt;li&gt;[καθ]εύδω&lt;/li&gt;
&lt;li&gt;[καθ]ίζω&lt;/li&gt;
&lt;li&gt;[κατα]δαρθάνω&lt;/li&gt;
&lt;li&gt;[παρα]σκευάζω&lt;/li&gt;
&lt;li&gt;[συλ]λέγω&lt;/li&gt;
&lt;li&gt;[ὑπ]οπτεύω&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It seems that compound verbs with a common base used for other compound verbs don’t get their own entries at all in Pratt and the base verb is to be referred to in that case. This is one example where bringing in metadata from Major’s list is potentially useful, in making sure common compound verbs can easily be looked up in their base verb form.&lt;/p&gt;
&lt;h2&gt;4. Multiple Present Stems Conjoined with Slashes&lt;/h2&gt;
&lt;p&gt;In these cases there are multiple alternative present (or more properly imperfective) stems conjoined with a slash.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;[ἀν]οίγνυμι/[ἀν]οίγω&lt;/li&gt;
&lt;li&gt;αὔξω/αὐξάνω&lt;/li&gt;
&lt;li&gt;καίω/κάω&lt;/li&gt;
&lt;li&gt;κλάω/κλαίω&lt;/li&gt;
&lt;li&gt;μείγνυμι/μίγνυμι&lt;/li&gt;
&lt;li&gt;οἴομαι/οἶμαι&lt;/li&gt;
&lt;li&gt;σκέπτομαι/σκοπέω&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;While these could arguably be treated as separate lemmas (and hence lexical entries) there are two arguments against doing this: (1) the two forms given are really just alternative spellings; (2) the lexical entries converge in other parts.&lt;/p&gt;
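&lt;p&gt;The notational conventions in sections 2 to 4 (braces, square brackets and slashes) can be unpacked mechanically. The following sketch uses a hypothetical &lt;code&gt;parse_pratt&lt;/code&gt; function and an invented dictionary layout. Note that the text after a bracketed preverb is returned exactly as printed, so it is not necessarily a valid citation form of the base.&lt;/p&gt;

```python
def parse_pratt(entry):
    """Split a Pratt-style first column into its component pieces.

    Returns a dict holding the alternative forms (each as a preverb/base
    pair) and the more common compound given in braces, if any.
    """
    main, _, rest = entry.partition("{")
    compound = rest.strip().rstrip("}").strip() or None
    forms = []
    for alt in main.strip().split("/"):
        alt = alt.strip()
        if alt.startswith("["):
            preverb, _, base = alt[1:].partition("]")
        else:
            preverb, base = None, alt
        forms.append({"preverb": preverb, "base": base})
    return {"forms": forms, "common_compound": compound}


for entry in ["αἰνέω {ἐπαινέω}", "[δια]φθείρω", "[ἀν]οίγνυμι/[ἀν]οίγω", "καίω/κάω"]:
    print(parse_pratt(entry))
```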
&lt;h2&gt;5. Homographs That Differ In Other Parts&lt;/h2&gt;
&lt;p&gt;δέω has two senses that, while identical in form in the first part, differ in other parts.&lt;/p&gt;
&lt;h2&gt;6. Spelling Differences with Optional Letter in Parentheses&lt;/h2&gt;
&lt;p&gt;There are two cases where an optional epsilon is given in parentheses:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;[ἐπι]μελ(έ)ομαι&lt;/li&gt;
&lt;li&gt;οἰκτ(ε)ίρω&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In some cases the spelling alternative continues into other parts.&lt;/p&gt;
&lt;h2&gt;7. Lexemes Where Other Lexemes are Merged In for Other Parts&lt;/h2&gt;
&lt;p&gt;These aren’t marked in the lemma itself but I’ve included them here as they represent a particular choice of lemma to group parts under. The actual parts from other lexemes are indicated by an asterisk in Pratt. Note that this is not the same as suppletion although arguably there is a fine line worth exploring in more detail at some point.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ἔρχομαι&lt;/li&gt;
&lt;li&gt;ἐρωτάω&lt;/li&gt;
&lt;li&gt;ἐσθίω&lt;/li&gt;
&lt;li&gt;λέγω&lt;/li&gt;
&lt;li&gt;πωλέω&lt;/li&gt;
&lt;li&gt;ὠνέομαι&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;8. Lexemes Without An Imperfective Stem&lt;/h2&gt;
&lt;p&gt;Some words like οἶδα have a lemma which is from a part other than the first. While in some cases when this happens, the lexeme has been merged with another (see 7), this category covers the case where it hasn’t been.&lt;/p&gt;
&lt;h2&gt;Concluding Thoughts&lt;/h2&gt;
&lt;p&gt;We’ll see further issues when we look at the other lists and how to merge them but for now let’s discuss possible solutions to the issues seen already.&lt;/p&gt;
&lt;p&gt;It is important to note that the information in the first column of the Pratt principal parts table (headed “present”) in the book is serving a number of distinct purposes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;providing an identifier for the entire row (what could properly be called the “lemma”)&lt;/li&gt;
&lt;li&gt;providing the first principal part (and hence the present / imperfective stem)&lt;/li&gt;
&lt;li&gt;providing additional information about the lexeme such as its preverb / base&lt;/li&gt;
&lt;/ul&gt;
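&lt;p&gt;One way to picture the three purposes above as separate fields is a small record type. This is a sketch only: the class name, field names and identifiers such as &lt;code&gt;pratt-0001&lt;/code&gt; are invented for illustration and are not the format used in the &lt;em&gt;Morphological Lexicon&lt;/em&gt;.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LexicalEntry:
    lemma_id: str                          # opaque identifier for the whole row
    first_parts: List[str]                 # one or more first principal parts
    preverb: Optional[str] = None          # preverb, if the entry is a compound
    base: Optional[str] = None             # base verb of a compound
    common_compound: Optional[str] = None  # more common compound of a base


entries = [
    LexicalEntry("pratt-0001", ["αἰνέω"], common_compound="ἐπαινέω"),
    LexicalEntry("pratt-0002", ["οἴομαι", "οἶμαι"]),
    LexicalEntry("pratt-0003", ["ἀπόλλυμι"], preverb="ἀπ", base="ὄλλυμι"),
]

for entry in entries:
    print(entry.lemma_id, entry.first_parts)
```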
&lt;p&gt;By separating these out we have a much clearer way forward. The lemma proper can really be any unique identifier and it can be treated completely opaquely. The first principal part (or parts when there is more than one under a single lemma) can be a separate field. Finally, information such as preverb / base decomposition can be expressed in yet further separate fields. This keeps the first principal part free of extra characters and the lemma opaque.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">This is part 2 of a series of blog posts about &lt;a href=&#34;/2016/06/17/modelling-stems-and-principal-part-lists/&#34;&gt;modelling stems and principal part lists&lt;/a&gt; and covers the complexities in the notion of a lemma identifying lexical entries, specifically in the Pratt principal parts.</summary>
  </entry><entry>
    <title type="html">Modelling Stems and Principal Part Lists</title>
    <link href="https://jktauber.com/2016/06/17/modelling-stems-and-principal-part-lists/" rel="alternate" type="text/html" title="Modelling Stems and Principal Part Lists"/>
    <published>2016-06-17</published>
    <updated>2016-06-17</updated>
    <id>https://jktauber.com/2016/06/17/modelling-stems-and-principal-part-lists</id>
    <content type="html" xml:base="https://jktauber.com/2016/06/17/modelling-stems-and-principal-part-lists/">&lt;p&gt;This is part 0 of a series of blog posts about modelling stems and principal part lists, particularly for Attic Greek but hopefully more generally applicable. This is largely writing up work already done but I’m doing cleanup as I go along as well.&lt;/p&gt;
&lt;p&gt;A core part of the handling of verbs in the &lt;em&gt;Morphological Lexicon&lt;/em&gt; is the set of terminations and sandhi rules that can generate paradigms attested in grammars like Louise Pratt’s &lt;em&gt;The Essentials of Greek Grammar&lt;/em&gt;. Another core part is the stem information for a broader range of verbs usually conveyed in works like Pratt’s in the form of lists of principal parts.&lt;/p&gt;
&lt;p&gt;A rough outline of (future) posts is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;/2016/06/18/sources-principal-part-lists/&#34;&gt;the sources of principal part lists for this work&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;/2016/06/18/lemmas-pratt-principal-parts/&#34;&gt;lemmas in the Pratt principal parts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;/2016/06/21/merging-morwood-and-pratt-lemmas/&#34;&gt;merging the Morwood and Pratt lemmas&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;/2016/06/22/merging-dcc-lemmas/&#34;&gt;merging the DCC lemmas&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;/2016/06/26/formatting-principal-parts/&#34;&gt;formatting of principal parts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;/2016/07/16/parsing-dcc-principal-parts/&#34;&gt;parsing the DCC principal parts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;/2016/07/24/more-parsing-dcc-principal-parts/&#34;&gt;more parsing the DCC principal parts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;how to model a merge of the lists&lt;/li&gt;
&lt;li&gt;inferring stems from principal parts&lt;/li&gt;
&lt;li&gt;stems, terminations and sandhi&lt;/li&gt;
&lt;li&gt;relationships between stems&lt;/li&gt;
&lt;li&gt;???&lt;/li&gt;
&lt;/ul&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">This is part 0 of a series of blog posts about modelling stems and principal part lists, particularly for Attic Greek but hopefully more generally applicable. This is largely writing up work already done but I’m doing cleanup as I go along as well.</summary>
  </entry><entry>
    <title type="html">pyuca Published in The Journal of Open Source Software</title>
    <link href="https://jktauber.com/2016/05/19/pyuca-published-journal-open-source-software/" rel="alternate" type="text/html" title="pyuca Published in The Journal of Open Source Software"/>
    <published>2016-05-19</published>
    <updated>2016-05-19</updated>
    <id>https://jktauber.com/2016/05/19/pyuca-published-journal-open-source-software</id>
    <content type="html" xml:base="https://jktauber.com/2016/05/19/pyuca-published-journal-open-source-software/">&lt;p&gt;A research career requires publication in peer-reviewed journals but what if some of your scholarly output is in the form of software? The Journal of Open Source Software attempts to solve that by essentially wrapping peer-reviewed software packages up as lightweight papers. My pyuca library was just accepted for publication by the journal.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/jtauber/pyuca&#34;&gt;pyuca&lt;/a&gt; is a Python implementation of the Unicode Collation Algorithm and is a vital part of most of my Greek work because it lets me properly sort Greek words. It&#39;s not limited to Greek, though, and the library is potentially useful for anyone doing text processing using Python on natural languages other than English.&lt;/p&gt;
&lt;p&gt;pyuca has always been citable in an ad-hoc fashion, but thanks to publication in &lt;a href=&#34;http://joss.theoj.org/about&#34;&gt;The Journal of Open Source Software&lt;/a&gt;, it can now be cited as a peer-reviewed journal article.&lt;/p&gt;
&lt;p&gt;The submission process was straightforward. I dug up an &lt;a href=&#34;https://en.wikipedia.org/wiki/ORCID&#34;&gt;ORCID&lt;/a&gt; (a persistent identifier for researchers) I&#39;d acquired a while ago but never used and set up my GitHub repo on &lt;a href=&#34;https://zenodo.org&#34;&gt;Zenodo&lt;/a&gt; so a &lt;a href=&#34;https://en.wikipedia.org/wiki/Digital_object_identifier&#34;&gt;Digital Object Identifier&lt;/a&gt; (DOI) gets minted for each release.&lt;/p&gt;
&lt;p&gt;I then added a specially-formatted &lt;a href=&#34;https://github.com/jtauber/pyuca/blob/master/paper.md&#34;&gt;paper.md&lt;/a&gt; file to the repo (including my ORCID, abstract about the software and any references) and submitted the repo for consideration.&lt;/p&gt;
&lt;p&gt;JOSS reviews are done openly using GitHub issues. A reviewer stepped up and gave some excellent feedback on the usage example in my README and on adding contributor guidelines. Once I&#39;d addressed that feedback, the paper was accepted by the reviewer and the editor-in-chief and a new DOI was minted for the paper itself.&lt;/p&gt;
&lt;p&gt;I also got a notification from ORCID that &lt;a href=&#34;http://crossref.org&#34;&gt;Crossref&lt;/a&gt; had found a new work to be added to my ORCID record.&lt;/p&gt;
&lt;p&gt;Of course, I could at some point write an article &lt;em&gt;about&lt;/em&gt; pyuca but an article about software is not the same as the software itself (they would likely have quite different audiences) and so citing an article about particular software is not the same as citing the software itself. Thanks to JOSS, the distinction can be maintained while still keeping within a framework of peer-reviewed journal articles.&lt;/p&gt;
&lt;p&gt;I&#39;m particularly excited that JOSS accepted software with a digital humanities application rather than their typical scientific computing applications.&lt;/p&gt;
&lt;p&gt;So if you publish a work that made use of pyuca, you can now cite it as:&lt;/p&gt;
&lt;p&gt;Tauber, J. K. (2016). pyuca: a Python implementation of the Unicode Collation Algorithm. The Journal of Open Source Software. DOI: 10.21105/joss.00021&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">A research career requires publication in peer-reviewed journals but what if some of your scholarly output is in the form of software? The Journal of Open Source Software attempts to solve that by essentially wrapping peer-reviewed software packages up as lightweight papers. My pyuca library was just accepted for publication by the journal.</summary>
  </entry><entry>
    <title type="html">Varro’s Four Parts of Speech for Latin</title>
    <link href="https://jktauber.com/2016/05/04/varros-four-parts-speech-latin/" rel="alternate" type="text/html" title="Varro’s Four Parts of Speech for Latin"/>
    <published>2016-05-04</published>
    <updated>2016-05-04</updated>
    <id>https://jktauber.com/2016/05/04/varros-four-parts-speech-latin</id>
    <content type="html" xml:base="https://jktauber.com/2016/05/04/varros-four-parts-speech-latin/">&lt;p&gt;In my post &lt;a href=&#34;/2015/11/05/morphological-parts-speech-greek/&#34;&gt;Morphological Parts of Speech in Greek&lt;/a&gt; last year, I presented a model of five or six parts of speech based purely on what they inflect for. I just found out Varro suggested similar for Latin over two thousand years ago.&lt;/p&gt;
&lt;p&gt;In his article &lt;cite&gt;Dionysius Thrax vs Marcus Varro&lt;/cite&gt; in Historiographia Linguistica 17:1-2 (1990), Daniel Taylor argues for the greater significance of Varro over Thrax in the history of Greco-Roman linguistics.&lt;/p&gt;
&lt;p&gt;I actually started reading the article for comparisons made with Theodosius but his description of Varro&#39;s parts of speech caught my eye. After introducing Thrax&#39;s list of eight parts of speech for Greek (noun, verb, participle, article, pronoun, preposition, adverb, and conjunction) which has dominated since, he describes Varro&#39;s for Latin:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;His definitions are exclusively grammatical, and there are but four parts of speech: one with case, one with tense, one with both, one with neither.&lt;/p&gt;
&lt;/blockquote&gt;
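&lt;p&gt;The four-way split is just the cross product of two yes/no features, which a few lines of code make explicit. The function name and the labels it returns are my own glosses for illustration, not Varro&#39;s or Taylor&#39;s terms.&lt;/p&gt;

```python
def varro_class(has_case, has_tense):
    """Classify a word by whether it inflects for case and for tense."""
    if has_case and has_tense:
        return "case and tense"
    if has_case:
        return "case only"
    if has_tense:
        return "tense only"
    return "neither"


examples = {
    "noun": (True, False),
    "finite verb": (False, True),
    "infinitive": (False, True),
    "participle": (True, True),
    "adverb": (False, False),
}

for word_type, (case, tense) in examples.items():
    print(word_type, "=", varro_class(case, tense))
```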
&lt;p&gt;This results in a similar division to the first table in my &lt;a href=&#34;/2015/11/05/morphological-parts-speech-greek/&#34;&gt;earlier blog post&lt;/a&gt;, although it conflates infinitives and finite verbs (which Thrax does as well).&lt;/p&gt;
&lt;p&gt;It&#39;s certainly appealing as an initial taxonomy of parts of speech, for Greek as well as Latin.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">In my post &lt;a href=&#34;/2015/11/05/morphological-parts-speech-greek/&#34;&gt;Morphological Parts of Speech in Greek&lt;/a&gt; last year, I presented a model of five or six parts of speech based purely on what they inflect for. I just found out Varro suggested similar for Latin over two thousand years ago.</summary>
  </entry><entry>
    <title type="html">Inflexion: Generic Code for Morphological Generation and Parsing</title>
    <link href="https://jktauber.com/2016/05/01/inflexion-code-morphological-generation-parsing/" rel="alternate" type="text/html" title="Inflexion: Generic Code for Morphological Generation and Parsing"/>
    <published>2016-05-01</published>
    <updated>2016-05-01</updated>
    <id>https://jktauber.com/2016/05/01/inflexion-code-morphological-generation-parsing</id>
    <content type="html" xml:base="https://jktauber.com/2016/05/01/inflexion-code-morphological-generation-parsing/">&lt;p&gt;Over the last few years, I&#39;ve worked on a number of iterations of code that can generate Ancient Greek verb forms. I&#39;ve now broken out the Greek-specific pieces and released a generic library called &lt;strong&gt;inflexion&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;There&#39;s nothing particularly innovative about the approach from a computational morphology point of view: it just uses a stem database combined with a list of endings including sandhi rules. I talked a bit about the endings / sandhi rules in &lt;a href=&#34;/2015/11/22/morphological-lexicon-new-testament-greek-slides/&#34;&gt;my SBL talk last year&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It takes a very practical approach, though, and, with a suitable stem database, ending / sandhi rules and accentuation code (all of which I&#39;m releasing separately shortly), it can currently generate every single verb form in Louise Pratt&#39;s intermediate grammar, on Helma Dik&#39;s Greek verb handouts and in Andrew Keller &amp;amp; Stephanie Russell&#39;s beginner-intermediate textbook.&lt;/p&gt;
&lt;p&gt;There&#39;s some support for parsing forms if the stem is known and I&#39;ll soon be working on support for when the necessary stem is not yet in the database. There&#39;s not yet any notion of stems being related and that will be a big part of future work which might be more interesting from a computational morphology point of view.&lt;/p&gt;
&lt;p&gt;In a way, the real power (or &#34;knowledge&#34;) is in the pieces not included in this library itself but I wanted to break out the generic code partly in case other people wanted to use it for other inflected languages but mostly just to keep my own code more modular.&lt;/p&gt;
&lt;p&gt;The GitHub repo is &lt;a href=&#34;https://github.com/jtauber/inflexion&#34;&gt;https://github.com/jtauber/inflexion&lt;/a&gt; and example-based &lt;a href=&#34;https://github.com/jtauber/inflexion/blob/master/docs.rst&#34;&gt;documentation&lt;/a&gt; is available.&lt;/p&gt;
&lt;p&gt;Stay tuned for new releases of the inflexion library but also the stem database, ending / sandhi rules and accentuation code that are specific to Greek.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Over the last few years, I&#39;ve worked on a number of iterations of code that can generate Ancient Greek verb forms. I&#39;ve now broken out the Greek-specific pieces and released a generic library called &lt;strong&gt;inflexion&lt;/strong&gt;.</summary>
  </entry><entry>
    <title type="html">17th International Morphology Meeting</title>
    <link href="https://jktauber.com/2016/02/19/17th-international-morphology-meeting/" rel="alternate" type="text/html" title="17th International Morphology Meeting"/>
    <published>2016-02-19</published>
    <updated>2016-02-19</updated>
    <id>https://jktauber.com/2016/02/19/17th-international-morphology-meeting</id>
    <content type="html" xml:base="https://jktauber.com/2016/02/19/17th-international-morphology-meeting/">&lt;p&gt;I&#39;m currently in Vienna for the &lt;a href=&#34;https://www.wu.ac.at/en/home/imm17/&#34;&gt;International Morphology Meeting&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It&#39;s been quite an adventure to get here, which you can read about &lt;a href=&#34;https://thoughtstreams.io/jtauber/lost-passport-adventure-2016/&#34;&gt;elsewhere&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;If four days full of morphology weren&#39;t enough, there are workshops specifically on computational methods and discriminative approaches, both of which are obviously of huge interest to me.&lt;/p&gt;
&lt;p&gt;I&#39;m also hoping to catch up with Jim Blevins who is a sort of undergraduate version of a Doktorvater to me.&lt;/p&gt;
&lt;p&gt;I&#39;m sure in the coming months you&#39;ll see a lot on this blog the seeds of which will have been sown at this conference.&lt;/p&gt;
&lt;p&gt;(and yes, that was a legitimate use of the future perfect)&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I&#39;m currently in Vienna for the &lt;a href=&#34;https://www.wu.ac.at/en/home/imm17/&#34;&gt;International Morphology Meeting&lt;/a&gt;.</summary>
  </entry><entry>
    <title type="html">An Updated Solution to Polytonic Greek Unicode’s Problems</title>
    <link href="https://jktauber.com/2016/02/09/updated-solution-polytonic-greek-unicodes-problems/" rel="alternate" type="text/html" title="An Updated Solution to Polytonic Greek Unicode’s Problems"/>
    <published>2016-02-09</published>
    <updated>2016-02-09</updated>
    <id>https://jktauber.com/2016/02/09/updated-solution-polytonic-greek-unicodes-problems</id>
    <content type="html" xml:base="https://jktauber.com/2016/02/09/updated-solution-polytonic-greek-unicodes-problems/">&lt;p&gt;In &lt;a href=&#34;/2016/01/28/polytonic-greek-unicode-is-still-not-perfect/&#34;&gt;Polytonic Greek Unicode Still Isn’t Perfect&lt;/a&gt;, I enumerated various challenges that still exist with using Polytonic Greek when vowel length needs to be marked. I now have a better appreciation of what solutions are actually realistic.&lt;/p&gt;
&lt;p&gt;After discussions with people on the Unicode mailing list, it&#39;s clear the solution is NOT to add more precomposed character code points to Unicode (or rather, such a solution will never be adopted by Unicode). Rather, the solution likely lies in the tools just understanding grapheme clusters. For more background, see &lt;a href=&#34;http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries&#34;&gt;Grapheme Cluster Boundaries&lt;/a&gt; in the Unicode Standard Annex on Unicode Text Segmentation.&lt;/p&gt;
&lt;p&gt;Perl 6 already has support for this: a layer above code points representing what are considered single graphemes even if made up of multiple code points. See, for example, Jonathan Worthington&#39;s &lt;a href=&#34;http://jnthn.net/papers/2015-spw-nfg.pdf&#34;&gt;slides on Normal Form Grapheme&lt;/a&gt;.&lt;/p&gt;
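&lt;p&gt;As a rough illustration of the idea of a layer above code points, the sketch below groups each base character with the combining marks that follow it. It is only an approximation under stated assumptions: the function name &lt;code&gt;clusters&lt;/code&gt; is hypothetical, and the rule used (attach anything with a non-zero canonical combining class to the previous character) is far simpler than the full extended grapheme cluster rules in TR29.&lt;/p&gt;

```python
import unicodedata


def clusters(text):
    """Group each base character with the combining marks that follow it."""
    result = []
    for ch in text:
        # unicodedata.combining() returns a non-zero class for combining marks
        if result and unicodedata.combining(ch):
            result[-1] += ch
        else:
            result.append(ch)
    return result


# phi, upsilon with macron, combining acute, omega
word = "\u03c6\u1fe1\u0301\u03c9"
print(len(word))            # number of code points
print(len(clusters(word)))  # number of base-plus-marks groups
```

&lt;p&gt;Code that iterates over the result of &lt;code&gt;clusters&lt;/code&gt; rather than over the string itself never sees a vowel separated from its acute, which is the property the grapheme layer is meant to provide.&lt;/p&gt;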
&lt;p&gt;So my plan is to at the very least implement a similar approach for Python 3 (unless someone else already has). That will still mean the problem has to separately be solved by:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;font foundries&lt;/li&gt;
&lt;li&gt;text editor developers&lt;/li&gt;
&lt;li&gt;keyboard / input source software developers&lt;/li&gt;
&lt;li&gt;operating system developers&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I&#39;ll try to engage with each of these groups and will keep people posted on my progress.&lt;/p&gt;
&lt;p&gt;Thanks to Ken Whistler for making clear that the path forward is not in more precomposed characters but in working with system vendors and font foundries.&lt;/p&gt;
&lt;p&gt;Thanks to Markus Scherer and Elizabeth Mattijsen for their pointers to TR29 and the Perl 6 work.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE (2016-12-04)&lt;/strong&gt;: Now see &lt;a href=&#34;/2016/12/04/diacritic-stacking-skolar-pe-fixed/&#34;&gt;Diacritic Stacking in Skolar PE Fixed&lt;/a&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">In &lt;a href=&#34;/2016/01/28/polytonic-greek-unicode-is-still-not-perfect/&#34;&gt;Polytonic Greek Unicode Still Isn’t Perfect&lt;/a&gt;, I enumerated various challenges that still exist with using Polytonic Greek when vowel length needs to be marked. I now have a better appreciation of what solutions are actually realistic.</summary>
  </entry><entry>
    <title type="html">Polytonic Greek Unicode Still Isn’t Perfect</title>
    <link href="https://jktauber.com/2016/01/28/polytonic-greek-unicode-is-still-not-perfect/" rel="alternate" type="text/html" title="Polytonic Greek Unicode Still Isn’t Perfect"/>
    <published>2016-01-28</published>
    <updated>2016-01-28</updated>
    <id>https://jktauber.com/2016/01/28/polytonic-greek-unicode-is-still-not-perfect</id>
    <content type="html" xml:base="https://jktauber.com/2016/01/28/polytonic-greek-unicode-is-still-not-perfect/">&lt;p&gt;Whether we&#39;re talking about fonts, programming languages, keyboard entry or even the command line, support for polytonic Greek has greatly improved even in the last 10 years, let alone the 23 years I&#39;ve been doing computational analysis of Greek texts.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE (2016-12-04): The Skolar examples in this post will no longer make sense as the issues have now been fixed. See &lt;a href=&#34;/2016/12/04/diacritic-stacking-skolar-pe-fixed/&#34;&gt;Diacritic Stacking in Skolar PE Fixed&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;With configurable input sources in OS X, it&#39;s easy to type polytonic Greek and the default fonts support all the Unicode codepoints for polytonic Greek. I can now just type Greek (rather than a transliteration or BetaCode) in data files or forum posts or emails or tweets or GitHub issues. There are still &lt;em&gt;some&lt;/em&gt; display issues with using polytonic Greek in fixed-width fonts but that&#39;s improving. Last year I talked about the bug I reported that got &lt;a href=&#34;/2015/11/02/atom-editor-11-fixes-polytonic-greek-bug/&#34;&gt;fixed in the Atom editor&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Python has long supported Unicode and Python 3 made it even easier to deal with text processing of Unicode files. It doesn&#39;t sort polytonic Greek correctly out of the box, but I wrote &lt;a href=&#34;https://github.com/jtauber/pyuca&#34;&gt;pyuca&lt;/a&gt; to solve that problem!&lt;/p&gt;
&lt;p&gt;The situation seemed almost perfect until I started doing a lot more work that required me to track vowel length and, in particular, use a macron ˉ to distinguish long α, ι, and υ from short. It&#39;s okay when the macron is the only diacritic on a vowel: the problems start when a vowel has both an acute and a macron. (There is no need for a macron and a circumflex as the circumflex already implies the vowel is long. Same with an iota subscript.)&lt;/p&gt;
&lt;h3&gt;Problem 1: No precomposed character code points&lt;/h3&gt;
&lt;p&gt;ᾱ can be written as the decomposed &lt;code&gt;U+03B1 U+0304&lt;/code&gt; or the precomposed &lt;code&gt;U+1FB1&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; import unicodedata
&amp;gt;&amp;gt;&amp;gt; len(&#39;ᾱ&#39;)
1
&amp;gt;&amp;gt;&amp;gt; [hex(ord(ch)) for ch in &#39;ᾱ&#39;]
[&#39;0x1fb1&#39;]
&amp;gt;&amp;gt;&amp;gt; [unicodedata.name(ch) for ch in &#39;ᾱ&#39;]
[&#39;GREEK SMALL LETTER ALPHA WITH MACRON&#39;]
&amp;gt;&amp;gt;&amp;gt; unicodedata.decomposition(&#39;ᾱ&#39;)
&#39;03B1 0304&#39;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;ά can be written as the decomposed &lt;code&gt;U+03B1 U+0301&lt;/code&gt; or the precomposed &lt;code&gt;U+03AC&lt;/code&gt; (assuming normalization to a tonos which the Greek Polytonic Input Source on OS X does):&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; len(&#39;ά&#39;)
1
&amp;gt;&amp;gt;&amp;gt; [hex(ord(ch)) for ch in &#39;ά&#39;]
[&#39;0x3ac&#39;]
&amp;gt;&amp;gt;&amp;gt; [unicodedata.name(ch) for ch in &#39;ά&#39;]
[&#39;GREEK SMALL LETTER ALPHA WITH TONOS&#39;]
&amp;gt;&amp;gt;&amp;gt; unicodedata.decomposition(&#39;ά&#39;)
&#39;03B1 0301&#39;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;But there&#39;s no precomposed character &lt;code&gt;ᾱ́&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; len(&#39;ᾱ́&#39;)
2
&amp;gt;&amp;gt;&amp;gt; [hex(ord(ch)) for ch in &#39;ᾱ́&#39;]
[&#39;0x1fb1&#39;, &#39;0x301&#39;]
&amp;gt;&amp;gt;&amp;gt; [hex(ord(ch)) for ch in unicodedata.normalize(&#39;NFC&#39;, &#39;ᾱ́&#39;)]
[&#39;0x1fb1&#39;, &#39;0x301&#39;]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;As you can see, even Python 3 views &lt;code&gt;ᾱ́&lt;/code&gt; as two characters. This also screws up font metrics in many text editors and browser text areas (like the one I&#39;m writing this post in).&lt;/p&gt;
&lt;h3&gt;Problem 2: Many fonts with otherwise excellent polytonic Greek support don&#39;t display it properly&lt;/h3&gt;
&lt;p&gt;The Skolar PE font I use on this site can&#39;t properly display &lt;code&gt;ᾱ́&lt;/code&gt;. It displays it as ᾱ́. Ironically this is one time the fixed width fonts do a better job!&lt;/p&gt;
&lt;h3&gt;Problem 3: You can&#39;t normalize an alternative ordering of diacritics&lt;/h3&gt;
&lt;p&gt;If you already have a &lt;code&gt;GREEK SMALL LETTER ALPHA WITH TONOS&lt;/code&gt; and you add a &lt;code&gt;COMBINING MACRON&lt;/code&gt; you end up (at least in the fonts I&#39;ve tried) with something that even visually looks different from the &lt;code&gt;GREEK SMALL LETTER ALPHA WITH MACRON&lt;/code&gt; followed by &lt;code&gt;COMBINING ACUTE ACCENT&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; &amp;quot;\u03ac\u0304&amp;quot;
&#39;ά̄&#39;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;(Notice that &lt;code&gt;ά̄&lt;/code&gt; != &lt;code&gt;ᾱ́&lt;/code&gt; and oddly, Skolar PE does a better job of the former than the latter: ά̄ vs ᾱ́)&lt;/p&gt;
&lt;p&gt;And to make matters worse, you can&#39;t normalize one to the other:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; [hex(ord(ch)) for ch in unicodedata.normalize(&#39;NFC&#39;, &#39;\u03ac\u0304&#39;)]
[&#39;0x3ac&#39;, &#39;0x304&#39;]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You have to combine the components in the correct order with the macron FIRST:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; [hex(ord(ch)) for ch in unicodedata.normalize(&#39;NFC&#39;, &#39;\u03b1\u0304\u0301&#39;)]
[&#39;0x1fb1&#39;, &#39;0x301&#39;]
&amp;gt;&amp;gt;&amp;gt; [hex(ord(ch)) for ch in unicodedata.normalize(&#39;NFC&#39;, &#39;\u03b1\u0301\u0304&#39;)]
[&#39;0x3ac&#39;, &#39;0x304&#39;]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This is not a bug: technically &lt;code&gt;ά̄&lt;/code&gt; and &lt;code&gt;ᾱ́&lt;/code&gt; are distinct graphemes, but it&#39;s still an annoyance because any code that adds diacritics needs to know the correct order in which to add them.&lt;/p&gt;
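&lt;p&gt;One way code that adds diacritics could cope with this is to decompose the letter, put the marks into a fixed order, and recompose. The sketch below is a minimal illustration, not the approach of any particular library: the name &lt;code&gt;add_diacritic&lt;/code&gt; and the ordering in &lt;code&gt;MARK_ORDER&lt;/code&gt; (macron, then breathing, then acute) are assumptions, and it only handles a single base letter. An explicit order is needed because the macron and the acute share the same canonical combining class, so normalization alone never reorders them.&lt;/p&gt;

```python
import unicodedata

MACRON = "\u0304"
SMOOTH = "\u0313"
ROUGH = "\u0314"
ACUTE = "\u0301"

# Assumed house order: length mark first, then breathing, then accent.
MARK_ORDER = [MACRON, SMOOTH, ROUGH, ACUTE]


def add_diacritic(letter, mark):
    """Add a combining mark to a single letter, keeping marks in MARK_ORDER."""
    decomposed = unicodedata.normalize("NFD", letter)
    base, marks = decomposed[0], list(decomposed[1:]) + [mark]
    # Marks missing from MARK_ORDER sort last; list.sort is stable.
    marks.sort(key=lambda m: MARK_ORDER.index(m) if m in MARK_ORDER else len(MARK_ORDER))
    return unicodedata.normalize("NFC", base + "".join(marks))


# Both routes give the same sequence, matching the macron-first result above.
print([hex(ord(ch)) for ch in add_diacritic("\u03ac", MACRON)])
print([hex(ord(ch)) for ch in add_diacritic("\u1fb1", ACUTE)])
```

&lt;p&gt;With a helper like this, it no longer matters whether the acute or the macron was applied first; the stored form is always the macron-first one.&lt;/p&gt;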
&lt;h3&gt;Problem 4: No support in the Greek Polytonic Input Source&lt;/h3&gt;
&lt;p&gt;The Greek Polytonic Input Source supports typing a digraph (diacritic then base) to produce precomposed characters but you can&#39;t use a trigraph to enter &lt;code&gt;ᾱ́&lt;/code&gt;. In fact, every time I&#39;ve needed to type &lt;code&gt;ᾱ́&lt;/code&gt; in this post, I&#39;ve needed to copy and paste it from an earlier usage (and manually minted one via Python the first time).&lt;/p&gt;
&lt;h3&gt;Problem 5: My existing syllabification heuristics didn&#39;t work&lt;/h3&gt;
&lt;p&gt;I recently had to tweak the syllabification heuristics in my &lt;a href=&#34;https://github.com/jtauber/greek-accentuation&#34;&gt;greek-accentuation&lt;/a&gt; Python library to correctly syllabify words like &lt;code&gt;φῡ́ω&lt;/code&gt;. Prior to 0.9.4, it put a syllable division between the macron and the acute!&lt;/p&gt;
&lt;p&gt;This would not have happened if Unicode (and hence Python) treated &lt;code&gt;ῡ́&lt;/code&gt; as a single character.&lt;/p&gt;
&lt;h3&gt;Problem 6: There&#39;s also breathing&lt;/h3&gt;
&lt;p&gt;I thought I was all set after fixing Problem 5 but then I hit the imperfect of ἵστημι which starts in most cases with &lt;code&gt;ῑ́̔&lt;/code&gt;/&lt;code&gt;ῑ̔́&lt;/code&gt;  (yes, that should be a rough breathing and acute with a macron.) I&#39;m in the process of working around this problem in &lt;code&gt;greek-accentuation&lt;/code&gt; now.&lt;/p&gt;
&lt;h2&gt;The Solution&lt;/h2&gt;
&lt;p&gt;The root cause of all this is just that Unicode-based code can&#39;t treat &lt;code&gt;ῑ́̔&lt;/code&gt; or &lt;code&gt;ῡ́&lt;/code&gt; or &lt;code&gt;ᾱ́&lt;/code&gt; as single characters because Unicode doesn&#39;t have a codepoint for the precomposed characters. I imagine it&#39;s a long road to get the Unicode Consortium to &#34;fix&#34; this, if it&#39;s even possible. And even if some future version of Unicode fixed it, I&#39;d have to wait for Python and OS X to catch up before the problem really goes away. For now I&#39;ll just have to continue to work around the problem in code like my &lt;code&gt;greek-accentuation&lt;/code&gt; library. That still doesn&#39;t solve the problem with the Skolar PE fonts but I might be able to raise that issue with the font foundry.&lt;/p&gt;
&lt;p&gt;It&#39;s possible there are additional workarounds or tricks I&#39;m not aware of. If there are, please let me know.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;CORRECTION&lt;/strong&gt;: Thanks to Tom Gewecke for pointing out an earlier misstatement about the Polytonic Greek Input Source on OS X producing combining characters. It does not. It supports digraphs to produce precomposed characters.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;CORRECTION&lt;/strong&gt;: Thanks to Martin J. Dürst for pointing out that &lt;code&gt;ά̄&lt;/code&gt; and &lt;code&gt;ᾱ́&lt;/code&gt; are distinct graphemes and so the fact they aren&#39;t normalized to each other isn&#39;t a problem with Unicode as such.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE&lt;/strong&gt;: I remarked at the end of Problem 1 about font metrics in editors / text areas but really I should make that a separate problem. Related (and perhaps yet another problem) is selecting characters with multiple diacritics.&lt;/p&gt;
&lt;h2&gt;Updated Solution&lt;/h2&gt;
&lt;p&gt;Now see my later post: &lt;a href=&#34;/2016/02/09/updated-solution-polytonic-greek-unicodes-problems/&#34;&gt;An Updated Solution to Polytonic Greek Unicode’s Problems&lt;/a&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Whether we&#39;re talking about fonts, programming languages, keyboard entry or even the command line, support for polytonic Greek has greatly improved even in the last 10 years, let alone the 23 years I&#39;ve been doing computational analysis of Greek texts.</summary>
  </entry><entry>
    <title type="html">greek-utils 0.1 Released</title>
    <link href="https://jktauber.com/2016/01/18/greek-utils-01-released/" rel="alternate" type="text/html" title="greek-utils 0.1 Released"/>
    <published>2016-01-18</published>
    <updated>2016-01-18</updated>
    <id>https://jktauber.com/2016/01/18/greek-utils-01-released</id>
    <content type="html" xml:base="https://jktauber.com/2016/01/18/greek-utils-01-released/">&lt;p&gt;While I write and release a lot of Python code for working with Ancient Greek, it tends to be either throwaway code for data wrangling or fairly specialized code for things like accentuation or inflectional morphology.&lt;/p&gt;
&lt;p&gt;I decided there needed to be a place to put lightweight utilities that can be used by a range of different projects. This is the motivation for &lt;code&gt;greek-utils&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The initial 0.1 release of &lt;code&gt;greek-utils&lt;/code&gt; just provides the following features:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Convert BetaCode to Unicode&lt;/li&gt;
&lt;li&gt;Turn an iterable into a generator over trigrams&lt;/li&gt;
&lt;li&gt;A Trie datastructure&lt;/li&gt;
&lt;li&gt;MorphGNT BCV string to human-readable verse reference&lt;/li&gt;
&lt;/ul&gt;
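&lt;p&gt;To give a flavour of the second item, a generator over trigrams can be sketched in a few lines. This is a hypothetical stand-in, not the &lt;code&gt;greek-utils&lt;/code&gt; implementation: the function name &lt;code&gt;trigrams&lt;/code&gt; and its behaviour at the edges (no padding, nothing produced for inputs shorter than three items) are assumptions, so check the repo documentation for the real API.&lt;/p&gt;

```python
from itertools import islice, tee


def trigrams(iterable):
    """Return an iterator over overlapping 3-tuples (hypothetical helper)."""
    a, b, c = tee(iterable, 3)
    # Offset the second and third copies by one and two items respectively.
    return zip(a, islice(b, 1, None), islice(c, 2, None))


print(list(trigrams("logos")))
```

&lt;p&gt;Because it is built on &lt;code&gt;tee&lt;/code&gt; and &lt;code&gt;zip&lt;/code&gt;, the sketch works lazily on any iterable, not just strings.&lt;/p&gt;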
&lt;p&gt;&lt;code&gt;greek-utils&lt;/code&gt; is pip installable and the repo is at&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/jtauber/greek-utils&#34;&gt;https://github.com/jtauber/greek-utils&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Full documentation is included there.&lt;/p&gt;
&lt;p&gt;I&#39;ll be moving a lot more out of gists and individual project repos over the coming months.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">While I write and release a lot of Python code for working with Ancient Greek, it tends to be either throwaway code for data wrangling or fairly specialized code for things like accentuation or inflectional morphology.</summary>
  </entry><entry>
    <title type="html">Direct Speech Capitalization and the First Preceding Head</title>
    <link href="https://jktauber.com/2016/01/17/direct-speech-capitalization-first-preceding-head/" rel="alternate" type="text/html" title="Direct Speech Capitalization and the First Preceding Head"/>
    <published>2016-01-17</published>
    <updated>2016-01-17</updated>
    <id>https://jktauber.com/2016/01/17/direct-speech-capitalization-first-preceding-head</id>
    <content type="html" xml:base="https://jktauber.com/2016/01/17/direct-speech-capitalization-first-preceding-head/">&lt;p&gt;As part of my explicit annotation of the normalization column in MorphGNT, I started down the rabbit hole of capitalization conventions which led to an interesting experiment with direct speech and the GBI syntax trees.&lt;/p&gt;
&lt;p&gt;Back in &lt;a href=&#34;/2015/11/27/annotating-normalization-column-morphgnt-part-1/&#34;&gt;Annotating the Normalization Column in MorphGNT: Part 1&lt;/a&gt;, I talked about wanting to catalogue the reasons why a word in the text differs from the normalized form, and annotate the text on a per-case basis. One difference mentioned was capitalization.&lt;/p&gt;
&lt;p&gt;In Greek texts printed nowadays, there are three reasons why a word might start with an uppercase letter:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;it&#39;s a proper noun&lt;/li&gt;
&lt;li&gt;it&#39;s the start of a paragraph&lt;/li&gt;
&lt;li&gt;it&#39;s the start of direct speech&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So I obviously want to be able to explicitly say, in each case, which it is (of course it could be more than one or even all three, potentially).&lt;/p&gt;
&lt;p&gt;The heuristic for the proper nouns is easy if you actually have tagged the proper nouns or lemmatized the text (although there are some inconsistencies as I&#39;ve already mentioned which need to get cleaned up in MorphGNT).&lt;/p&gt;
&lt;p&gt;The start of a paragraph heuristic should be straightforward as the electronic SBLGNT text has paragraphs indicated but there are some oddities I&#39;m looking at (including 30 cases where a word after a paragraph break is not capitalized, some of which are inconsistencies in SBLGNT itself).&lt;/p&gt;
&lt;p&gt;The direct speech is most interesting. I started by assuming that, if the lemma isn&#39;t capitalized and the word isn&#39;t at the start of a paragraph, it must be the start of direct speech. There are 2,225 cases of this in the SBLGNT text underlying the MorphGNT.&lt;/p&gt;
&lt;p&gt;Then I implemented a little heuristic where I traversed up the heads from the start of the direct speech (using the dependency version of the GBI Syntax Trees) until hitting a word that preceded the direct speech. Let&#39;s call that the &lt;strong&gt;first preceding head&lt;/strong&gt;.&lt;/p&gt;
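&lt;p&gt;The traversal just described can be sketched as follows. This is a minimal illustration under assumptions, not the code actually used: the function name &lt;code&gt;first_preceding_head&lt;/code&gt;, the representation of the tree as a dictionary from word position to head position, and the toy data are all hypothetical.&lt;/p&gt;

```python
def first_preceding_head(start, heads):
    """Walk up the heads from `start` until reaching a word that precedes it.

    `heads` maps each word position to the position of its head
    (None for the root). Returns None if no preceding head is found.
    """
    seen = set()
    current = heads.get(start)
    while current is not None and current not in seen:
        if current in range(start):  # current comes before start in the text
            return current
        seen.add(current)  # guard against cycles in malformed data
        current = heads.get(current)
    return None


# Toy sentence: word 1 is a verb of saying, direct speech starts at word 3,
# whose own head (word 5) comes later in the sentence.
heads = {0: 1, 1: None, 2: 1, 3: 5, 4: 5, 5: 1}
print(first_preceding_head(3, heads))
```

&lt;p&gt;In the toy data the walk goes from word 3 up to word 5, which follows the direct speech, and then on to word 1, which precedes it and is therefore returned.&lt;/p&gt;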
&lt;p&gt;My hypothesis was that the first preceding head would be some verb of communication (saying, writing, etc). In theory one might also expect a complementizer but the GBI Syntax Trees don&#39;t treat complementizers as heads so they don&#39;t come up in practice.&lt;/p&gt;
&lt;p&gt;In 1,641 instances, the first preceding head was a form of λέγω. In much rarer instances (no lexeme with more than 64 instances) there were other verbs like γράφω, ἀποκρίνομαι, φημί, ἐπερωτάω, or κράζω.&lt;/p&gt;
&lt;p&gt;In some cases the first preceding head was clearly not a verb of communication (and often not a verb at all). Going through the first half of Matthew so far, here are the explanations I&#39;ve discovered:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;in Matt 6.31, three instances of direct speech are disjoined and the GBI Trees model disjunction in such a way that the second and third instances are linked to the first rather than the actual verb of communication, λέγοντες&lt;/li&gt;
&lt;li&gt;in Matt 8.9, the verb of communication is elided in the second and third cases so the GBI Tree attaches the direct speech elsewhere&lt;/li&gt;
&lt;li&gt;Matt 9.13 has &#34;μάθετε τί ἐστιν&#34; and Matt 12.7 has &#34;εἰ δὲ ἐγνώκειτε τί ἐστιν&#34; and the GBI Trees end up hanging the direct speech (or &#34;meaning&#34;) off τί&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There were 118 cases in the entire text where there was no first preceding head. Going through the first half of Matthew again, the majority of these are cases where there is no direct speech but a word has been capitalized without an actual paragraph break. However, there are a couple of other interesting scenarios:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;in Matt 11.21, we might expect ἤρξατο ὀνειδίζειν to be linked to the direct speech with a participle of saying but none is provided&lt;/li&gt;
&lt;li&gt;similarly in Matt 13.33, there is direct speech but no participle linking to ἐλάλησεν&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;My plan is to go through the rest of the text and describe all the scenarios, but as this is somewhat of an unexpected rabbit hole, it might take me a while.&lt;/p&gt;
&lt;p&gt;If anyone is interested in a raw dump of the data with my explanations (covered above) so far, see &lt;a href=&#34;https://gist.github.com/jtauber/39d85cff34c71a2df169&#34;&gt;https://gist.github.com/jtauber/39d85cff34c71a2df169&lt;/a&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">As part of my explicit annotation of the normalization column in MorphGNT, I started down the rabbit hole of capitalization conventions which led to an interesting experiment with direct speech and the GBI syntax trees.</summary>
  </entry><entry>
    <title type="html">MorphGNT 6.07 Released</title>
    <link href="https://jktauber.com/2016/01/16/morphgnt-607-released/" rel="alternate" type="text/html" title="MorphGNT 6.07 Released"/>
    <published>2016-01-16</published>
    <updated>2016-01-16</updated>
    <id>https://jktauber.com/2016/01/16/morphgnt-607-released</id>
    <content type="html" xml:base="https://jktauber.com/2016/01/16/morphgnt-607-released/">&lt;p&gt;The latest release of MorphGNT (with a corresponding release of the Python library py-sblgnt) fixes some lemmatization issues along with a couple of accent and part-of-speech changes.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;use acute at end of sentence in Luke 10.38&lt;/li&gt;
&lt;li&gt;use ἄγω as lemma of ἄγε per issue #39&lt;/li&gt;
&lt;li&gt;use ἱερός lemma in all situations per issue #36&lt;/li&gt;
&lt;li&gt;fix accent in συνίημι lemma in Acts 28.26 per issue #37&lt;/li&gt;
&lt;li&gt;fixed θαρσέω lemmas where forms use ρσ as well per issue #38&lt;/li&gt;
&lt;li&gt;fixed προώρισε(ν) lemma in Acts 4.28 per issue #40&lt;/li&gt;
&lt;li&gt;elaborated on part of speech and parsing codes in README&lt;/li&gt;
&lt;li&gt;corrected lemmatization of ἤρχοντο in John 4.30 per issue #41&lt;/li&gt;
&lt;li&gt;changed μακράν to adverb when lemma is μακράν per issue #33&lt;/li&gt;
&lt;li&gt;changed lemma for ἔδει to δέω per issue #24&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Thanks Scott Fleischmann, Ulrik Sandborg-Petersen and Emma Ehrhardt.&lt;/p&gt;
&lt;p&gt;MorphGNT is available at &lt;a href=&#34;https://github.com/morphgnt/sblgnt&#34;&gt;https://github.com/morphgnt/sblgnt&lt;/a&gt; and all issues should be filed there.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/morphgnt/py-sblgnt&#34;&gt;py-sblgnt&lt;/a&gt; 0.5 is now available on PyPI for those wanting to access MorphGNT via a pip-installable Python API.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">The latest release of MorphGNT (with a corresponding release of the Python library py-sblgnt) fixes some lemmatization issues along with a couple of accent and part-of-speech changes.</summary>
  </entry><entry>
    <title type="html">Gouin on Language Learning</title>
    <link href="https://jktauber.com/2016/01/13/gouin-language-learning/" rel="alternate" type="text/html" title="Gouin on Language Learning"/>
    <published>2016-01-13</published>
    <updated>2016-01-13</updated>
    <id>https://jktauber.com/2016/01/13/gouin-language-learning</id>
    <content type="html" xml:base="https://jktauber.com/2016/01/13/gouin-language-learning/">&lt;p&gt;I recently found out about François Gouin, a sort of proto-Charles Berlitz who wrote (in French) a book called &lt;em&gt;The art of teaching and studying languages&lt;/em&gt;, published in 1880 and then translated and published in English in 1892.&lt;/p&gt;
&lt;p&gt;I&#39;ve only skimmed the book so far but it looks like it contains some real gems relating to the teaching of Greek.&lt;/p&gt;
&lt;p&gt;Gouin was a classics professor who attempted to learn German initially using the grammar-translation method used for Latin and Greek. The beginning of the book recounts what an utter failure it was and it&#39;s quite an amusing read with section headings such as &#34;An attempt at conversation—Disgust and fatigue—Reading and translation, their worthlessness demonstrated&#34;.&lt;/p&gt;
&lt;p&gt;After observing three-year-olds playing with language, the light went on and he developed his Series Method, described in the bulk of the rest of the book.&lt;/p&gt;
&lt;p&gt;He ends the book discussing implications of his findings for the teaching of Greek and Latin. Again, I haven&#39;t read in detail but I did enjoy his scathing remarks about the uselessness of dictionaries for learning a language and his bafflement at the fact students can spend 12 years learning Latin at school and still know nowhere near what someone learning German for six months under his method would know.&lt;/p&gt;
&lt;p&gt;If you&#39;re interested in the history of second language teaching with particular reference to Latin and Greek, the book might be worth checking out. It&#39;s available at &lt;a href=&#34;https://archive.org/details/artofteachingstu00gouirich&#34;&gt;Internet Archive&lt;/a&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I recently found out about François Gouin, a sort of proto-Charles Berlitz who wrote (in French) a book called &lt;em&gt;The art of teaching and studying languages&lt;/em&gt;, published in 1880 and then translated and published in English in 1892.</summary>
  </entry><entry>
    <title type="html">Off to the Linguistic Society of America’s 90th Annual Meeting</title>
    <link href="https://jktauber.com/2016/01/06/linguistic-society-americas-90th-annual-meeting/" rel="alternate" type="text/html" title="Off to the Linguistic Society of America’s 90th Annual Meeting"/>
    <published>2016-01-06</published>
    <updated>2016-01-06</updated>
    <id>https://jktauber.com/2016/01/06/linguistic-society-americas-90th-annual-meeting</id>
    <content type="html" xml:base="https://jktauber.com/2016/01/06/linguistic-society-americas-90th-annual-meeting/">&lt;p&gt;I&#39;m heading off to the LSA&#39;s annual meeting for the first time.&lt;/p&gt;
&lt;p&gt;This morning my Twitter timeline was filled with classicists heading off to the SCS annual meeting (okay, maybe not filled, but there were three or four). I must follow more classicists than linguists because I didn&#39;t see anyone tweeting about heading off to Washington DC for the LSA annual meeting.&lt;/p&gt;
&lt;p&gt;The fact they are on at the same time on different sides of the country doesn&#39;t exactly help cross-disciplinary collaboration and for a brief moment I wondered which to go to. It was actually an easy choice. I&#39;m far more of a linguist than a classicist, even though most of my linguistics for the last twenty-two years has been Ancient Greek related. A quick look at the programmes of each conference reassured me I&#39;d made the right decision.&lt;/p&gt;
&lt;p&gt;I don&#39;t yet know if anyone I personally know will be there, which always makes conferences awkward for me. I&#39;m also sitting an exam being proctored at a local university on Monday which I need to spend a decent amount of time studying for.&lt;/p&gt;
&lt;p&gt;That exam is actually the main reason I haven&#39;t blogged much since SBL. That will hopefully change next week when I&#39;m done!&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I&#39;m heading off to the LSA&#39;s annual meeting for the first time.</summary>
  </entry><entry>
    <title type="html">A (Not So) New Numbering System for Greek New Testament Lexemes</title>
    <link href="https://jktauber.com/2015/12/15/new-numbering-system-greek-new-testament-lexemes/" rel="alternate" type="text/html" title="A (Not So) New Numbering System for Greek New Testament Lexemes"/>
    <published>2015-12-15T11:40:13</published>
    <updated>2015-12-15T11:40:13</updated>
    <id>https://jktauber.com/2015/12/15/new-numbering-system-greek-new-testament-lexemes</id>
    <content type="html" xml:base="https://jktauber.com/2015/12/15/new-numbering-system-greek-new-testament-lexemes/">&lt;p&gt;Ten years ago, when Ulrik Sandborg-Petersen and I started collaborating, we came up with a way of referencing lexemes that would satisfy both the lumpers and splitters. At the time we wrote a paper that we circulated to a small audience but now it&#39;s finally up on Academia.edu.&lt;/p&gt;
&lt;p&gt;The 2006 unpublished paper is entitled &lt;a href=&#34;https://www.academia.edu/19660777/A_New_Numbering_System_for_Greek_New_Testament_Lexemes_2006_&#34;&gt;A New Numbering System for Greek New Testament Lexemes&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here&#39;s the abstract:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Numbering systems (such as Strong’s) are a popular way to reference the lexemes of the Greek New Testament corpus but a straight enumeration is not without problems, particularly when there is disagreement about whether two forms are the same lexeme or not. We present a way of referencing lexemes that allows competing viewpoints to be represented simultaneously. Existing numbering systems can be mapped into this new system without any loss of granularity and new analyses can be expressed without violating the integrity of existing references into the system.&lt;/p&gt;
&lt;/blockquote&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Ten years ago, when Ulrik Sandborg-Petersen and I started collaborating, we came up with a way of referencing lexemes that would satisfy both the lumpers and splitters. At the time we wrote a paper that we circulated to a small audience but now it&#39;s finally up on Academia.edu.</summary>
  </entry><entry>
    <title type="html">Functional Dependency in the MorphGNT Table</title>
    <link href="https://jktauber.com/2015/12/15/functional-dependency-morphgnt-table/" rel="alternate" type="text/html" title="Functional Dependency in the MorphGNT Table"/>
    <published>2015-12-15T17:06:47</published>
    <updated>2015-12-15T17:06:47</updated>
    <id>https://jktauber.com/2015/12/15/functional-dependency-morphgnt-table</id>
    <content type="html" xml:base="https://jktauber.com/2015/12/15/functional-dependency-morphgnt-table/">&lt;p&gt;Often it&#39;s useful to see whether certain columns in a table can be entirely determined by others. For example, can you unambiguously get the lemma from just the form (the answer is no, so a more useful question is which forms are ambiguous as to lemma)? Does knowing the part-of-speech help? Here we provide some code and give some examples.&lt;/p&gt;
&lt;p&gt;At the end I provide the script used.&lt;/p&gt;
&lt;p&gt;Run in the same directory as the MorphGNT SBLGNT, it runs like this:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;$ ./dep.py 6 7
45
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;What this is telling us is that there are 45 times where the value of column 6 (the normalized form) gives us &lt;em&gt;multiple&lt;/em&gt; possible values for column 7 (the lemma). In relational database terms we say that column 7 is not &lt;strong&gt;functionally dependent&lt;/strong&gt; on or not &lt;strong&gt;functionally determined&lt;/strong&gt; by column 6 because of those 45 cases.&lt;/p&gt;
&lt;p&gt;If you run:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;$ ./dep.py -v 6 7
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;it will actually list all 45, starting with something like:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;ἄμωμον {&#39;ἄμωμος&#39;, &#39;ἄμωμον&#39;}
ἴδε {&#39;ἴδε&#39;, &#39;ὁράω&#39;}
ὑποταγῇ {&#39;ὑποταγή&#39;, &#39;ὑποτάσσω&#39;}
καλῶν {&#39;καλός&#39;, &#39;καλέω&#39;}
Ἰουδαίας {&#39;Ἰουδαῖος&#39;, &#39;Ἰουδαία&#39;}
...
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You can also give more than one column for either the determinant or dependent.&lt;/p&gt;
&lt;p&gt;For example, does knowing the form AND part-of-speech determine the lemma?&lt;/p&gt;
&lt;p&gt;Turns out there are only 8 exceptions in the current MorphGNT/SBLGNT:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;$ ./dep.py -v 6,2 7
Ἅννα N- {&#39;Ἅννα&#39;, &#39;Ἅννας&#39;}
ἀνώτερον A- {&#39;ἀνώτερος&#39;, &#39;ἀνώτερον&#39;}
ἀλάβαστρον N- {&#39;ἀλάβαστρος&#39;, &#39;ἀλάβαστρον&#39;}
χρυσᾶ A- {&#39;χρύσεος&#39;, &#39;χρυσοῦς&#39;}
μακράν A- {&#39;μακράν&#39;, &#39;μακρός&#39;}
ὕστερον A- {&#39;ὕστερον&#39;, &#39;ὕστερος&#39;}
ταχύ A- {&#39;ταχύ&#39;, &#39;ταχύς&#39;}
ἤρχοντο V- {&#39;ἄρχω&#39;, &#39;ἔρχομαι&#39;}
8
&lt;/code&gt;&lt;/pre&gt;
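&lt;p&gt;The check itself is simple enough to sketch in a few lines. What follows is not the script linked at the end of this post but a minimal illustration of the same idea, assuming the rows have already been split into columns. The function name &lt;code&gt;violations&lt;/code&gt; and the three sample rows are made up for the example.&lt;/p&gt;

```python
from collections import defaultdict


def violations(rows, determinant, dependent):
    """Map each determinant value to its set of dependent values,
    keeping only the cases where more than one dependent value occurs.

    Columns are numbered from 1, as in the command-line examples.
    """
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[i - 1] for i in determinant)
        value = tuple(row[i - 1] for i in dependent)
        seen[key].add(value)
    # Every stored set is non-empty, so "not exactly one" means "several".
    return {key: values for key, values in seen.items() if len(values) != 1}


# Hypothetical rows in a seven-column layout (illustrative values only,
# not copied from the real MorphGNT files).
rows = [
    "010101 A- ----GPM- καλῶν καλῶν καλῶν καλός".split(),
    "010102 V- -PAPNSM- καλῶν καλῶν καλῶν καλέω".split(),
    "010103 N- ----NSM- λόγος λόγος λόγος λόγος".split(),
]

print(len(violations(rows, [6], [7])))     # form alone
print(len(violations(rows, [6, 2], [7])))  # form plus part-of-speech
```

&lt;p&gt;With these sample rows the form alone leaves one ambiguous case, and adding the part-of-speech column to the determinant resolves it.&lt;/p&gt;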

&lt;p&gt;There are other things that can be explored with this. How many lemmas have more than one part-of-speech in the MorphGNT/SBLGNT?&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;$ ./dep.py 7 2
70
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;How many forms have more than one parse analysis extant in the text, even if you know the lemma and part-of-speech:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;$ ./dep.py 6,7,2 3
903
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Given a lemma, part-of-speech and parse analysis, how many cases are there where multiple alternative forms are seen:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;$ ./dep.py 7,2,3 6
132
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Looking at these with the &lt;code&gt;-v&lt;/code&gt; option, you can see some are unavoidable:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;ὁράω V- 1AAI-P-- {&#39;εἴδομεν&#39;, &#39;εἴδαμεν&#39;}
κλείς N- ----APF- {&#39;κλεῖς&#39;, &#39;κλεῖδας&#39;}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;whereas others are likely corrections that need to be made to the lemmatization:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;τις RI ----GSM- {&#39;τινος&#39;, &#39;τινός&#39;}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The most recent set of corrections to MorphGNT/SBLGNT (which will be in release 6.07) stem from this sort of analysis.&lt;/p&gt;
&lt;p&gt;There are still more to discuss and resolve, however. See &lt;a href=&#34;https://github.com/morphgnt/sblgnt/issues/32&#34;&gt;https://github.com/morphgnt/sblgnt/issues/32&lt;/a&gt; and other issues on GitHub for details and to help in the discussion.&lt;/p&gt;
&lt;h2&gt;The script&lt;/h2&gt;
&lt;div&gt;
&lt;script src=&#34;https://gist.github.com/jtauber/ab691a5552d97a8c40c2.js&#34;&gt;&lt;/script&gt;
&lt;/div&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Often it&#39;s useful to see whether certain columns in a table can be entirely determined by others. For example, can you unambiguously get the lemma from just the form (the answer is no, so a more useful question is which forms are ambiguous as to lemma)? Does knowing the part-of-speech help? Here we provide some code and give some examples.</summary>
  </entry><entry>
    <title type="html">Annotating the Normalization Column in MorphGNT: Part 1</title>
    <link href="https://jktauber.com/2015/11/27/annotating-normalization-column-morphgnt-part-1/" rel="alternate" type="text/html" title="Annotating the Normalization Column in MorphGNT: Part 1"/>
    <published>2015-11-27</published>
    <updated>2015-11-27</updated>
    <id>https://jktauber.com/2015/11/27/annotating-normalization-column-morphgnt-part-1</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/27/annotating-normalization-column-morphgnt-part-1/">&lt;p&gt;Since the Series-6 release, MorphGNT has had a column that normalizes the word forms in the text for contextual things like accent changes, elision, movable nu and capitalization. I thought it would be useful to provide an annotation of exactly what normalization had been done for each word in the text and why.&lt;/p&gt;
&lt;p&gt;I wrote a short Python script that runs some heuristics on each case where the &#34;word&#34; column and &#34;norm&#34; column differ to determine the nature of the in-context change.&lt;/p&gt;
&lt;p&gt;In this post, I&#39;ll just report on some statistics. In later posts, I&#39;ll dive into further details that rely on actually looking at the surrounding context (rather than just the difference in one row).&lt;/p&gt;
&lt;p&gt;There are 47,630 times where the word and norm columns differ.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;38,523&lt;/strong&gt; times there is a &lt;strong&gt;change of accent&lt;/strong&gt; (clitics, oxytones taking graves, etc).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3,721&lt;/strong&gt; times there is a &lt;strong&gt;change in capitalization&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1,221&lt;/strong&gt; times there is &lt;strong&gt;elision&lt;/strong&gt;: 984 times a straight dropping of a final vowel, 237 times an additional aspiration of the preceding consonant.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;5,223&lt;/strong&gt; times there is a &lt;strong&gt;movable nu&lt;/strong&gt;. Note that both the existence and absence of nu is normalized to &lt;code&gt;(ν)&lt;/code&gt; so this covers all cases where a nu &lt;em&gt;could&lt;/em&gt; be dropped as well as the 142 times when it actually is.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;226&lt;/strong&gt; times there is a &lt;strong&gt;movable sigma&lt;/strong&gt; (20 times where it&#39;s actually dropped). This doesn&#39;t count ἐξ (another 234 times). There are also 825 times οὐκ appears and 105 times οὐχ appears.&lt;/p&gt;
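&lt;p&gt;To give a flavour of the kind of row-level heuristics involved, here is a rough sketch using only the standard library. It is not the script that produced the numbers above: the function names and labels are invented for illustration, it assumes elision is marked with a right single quotation mark (U+2019), and it ignores movable sigma entirely.&lt;/p&gt;

```python
import unicodedata

ACCENTS = {"\u0300", "\u0301", "\u0342"}  # grave, acute, circumflex


def strip_accents(text):
    """Remove accents but keep breathings and other marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if ch not in ACCENTS)


def classify(word, norm):
    """Return the set of change labels that apply to one row."""
    word = unicodedata.normalize("NFC", word)
    norm = unicodedata.normalize("NFC", norm)
    plain_word, plain_norm = strip_accents(word), strip_accents(norm)
    labels = set()
    # Same letters once accents are ignored, but the accents differ.
    if word.lower() != norm.lower() and plain_word.lower() == plain_norm.lower():
        labels.add("accent")
    # Same letters once case is ignored, but the case differs.
    if plain_word != plain_norm and plain_word.lower() == plain_norm.lower():
        labels.add("capitalization")
    # Assumption: an elided word ends in U+2019.
    if word.endswith("\u2019"):
        labels.add("elision")
    # The norm column writes a movable nu as "(ν)".
    if norm.endswith("(ν)"):
        labels.add("movable nu")
    return labels


print(classify("Καὶ", "καί"))
```

&lt;p&gt;Returning a set rather than a single label matters because one row can show more than one kind of change, which is why the category counts above add up to more than the number of differing rows.&lt;/p&gt;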
&lt;p&gt;In addition to the 47,630 cases above, there are also 32 other instances of two types of discrepancy that need to be resolved. One is ἑλπίδι with a rough breathing in Romans. The other is the cases where Χριστός appears with lower case χ. I&#39;m not sure what the solution to the former is but the latter might just involve having two distinct lemmata for Χριστός vs χριστός.&lt;/p&gt;
&lt;p&gt;All these statistics might seem of trivial interest but they are side effects of a more important task of both verifying the normalization and, as will be covered in subsequent posts, testing context-sensitive accentuation rules.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Since the Series-6 release, MorphGNT has had a column that normalizes the word forms in the text for contextual things like accent changes, elision, movable nu and capitalization. I thought it would be useful to provide an annotation of exactly what normalization had been done for each word in the text and why.</summary>
  </entry><entry>
    <title type="html">Back to a More Sustainable Blogging Pace</title>
    <link href="https://jktauber.com/2015/11/23/back-more-sustainable-blogging-pace/" rel="alternate" type="text/html" title="Back to a More Sustainable Blogging Pace"/>
    <published>2015-11-23</published>
    <updated>2015-11-23</updated>
    <id>https://jktauber.com/2015/11/23/back-more-sustainable-blogging-pace</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/23/back-more-sustainable-blogging-pace/">&lt;p&gt;Well, I did it! I blogged a post for every day in the four weeks leading up to my talk at SBL. It was a fantastic motivator but I can&#39;t sustain the pace.&lt;/p&gt;
&lt;p&gt;I&#39;ll try to at least blog once a week with a substantial post at least once a month but we&#39;ll see.&lt;/p&gt;
&lt;p&gt;There&#39;ll hopefully be a lot of ongoing progress to report but I&#39;ll also try to occasionally step back and write some more well-thought-out pieces, particularly on general linguistics. For thoughts-in-progress, I&#39;ll likely use &lt;a href=&#34;https://thoughtstreams.io/&#34;&gt;ThoughtStreams&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I&#39;m really hoping to collaborate with others on all the work I&#39;ve been talking about over the last four weeks and in my SBL talk, so if you&#39;re interested, email me at &lt;strong&gt;jtauber@jtauber.com&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;And because blogging won&#39;t be as regular, please subscribe to get email updates if you haven&#39;t already. Just fill out your email address in the form to the right (if you&#39;re on the site).&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Well, I did it! I blogged a post for every day in the four weeks leading up to my talk at SBL. It was a fantastic motivator but I can&#39;t sustain the pace.</summary>
  </entry><entry>
    <title type="html">A Morphological Lexicon of New Testament Greek: My SBL 2015 Slides</title>
    <link href="https://jktauber.com/2015/11/22/morphological-lexicon-new-testament-greek-slides/" rel="alternate" type="text/html" title="A Morphological Lexicon of New Testament Greek: My SBL 2015 Slides"/>
    <published>2015-11-22</published>
    <updated>2015-11-22</updated>
    <id>https://jktauber.com/2015/11/22/morphological-lexicon-new-testament-greek-slides</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/22/morphological-lexicon-new-testament-greek-slides/">&lt;p&gt;This morning I gave my talk at SBL 2015 on my &lt;em&gt;Morphological Lexicon&lt;/em&gt; project.&lt;/p&gt;
&lt;p&gt;I&#39;ve put the slides up &lt;a href=&#34;https://www.academia.edu/18816954/A_Morphological_Lexicon_of_New_Testament_Greek&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">This morning I gave my talk at SBL 2015 on my &lt;em&gt;Morphological Lexicon&lt;/em&gt; project.</summary>
  </entry><entry>
    <title type="html">Analyzing Verbal Morphology: Part 1</title>
    <link href="https://jktauber.com/2015/11/21/analyzing-verbal-morphology-part-1/" rel="alternate" type="text/html" title="Analyzing Verbal Morphology: Part 1"/>
    <published>2015-11-21</published>
    <updated>2015-11-21</updated>
    <id>https://jktauber.com/2015/11/21/analyzing-verbal-morphology-part-1</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/21/analyzing-verbal-morphology-part-1/">&lt;p&gt;In anticipation of my SBL talk tomorrow, here&#39;s an update on my verbal analysis.&lt;/p&gt;
&lt;p&gt;As I mentioned in &lt;a href=&#34;/2015/11/12/analyzing-nominal-morphology-part-1/&#34;&gt;Analyzing Nominal Morphology: Part 1&lt;/a&gt;, I started off with nominal morphology but the last couple of years have been more focused on the verb (until a couple of months ago when I switched back to the noun).&lt;/p&gt;
&lt;p&gt;My current modeling approach is actually my third attempt at verbs. Perhaps in a later post I&#39;ll describe the earlier approaches and why I backed out and started from scratch twice. I&#39;m happy with the path I&#39;m following now, though.&lt;/p&gt;
&lt;p&gt;Unlike the approach I took later with nouns, my verb analysis didn&#39;t focus on theme/distinguisher but on stem/suffix with sandhi rules. One reason for this is that one of my immediate goals was stem generation.&lt;/p&gt;
&lt;p&gt;Prior to running on all the MorphGNT verbs, I started with Helma Dik&#39;s &lt;em&gt;Nifty Greek Handouts&lt;/em&gt; and the verb paradigms in Louise Pratt&#39;s &lt;em&gt;The Essentials of Greek Grammar&lt;/em&gt;. Coverage is now those plus all the MorphGNT verbs except for imperatives, subjunctives and optatives.&lt;/p&gt;
&lt;p&gt;The code and data are currently available at &lt;a href=&#34;https://github.com/jtauber/greek-inflection&#34;&gt;https://github.com/jtauber/greek-inflection&lt;/a&gt; although I may move at least the GNT-specific data to be in the &lt;code&gt;morphological-lexicon&lt;/code&gt; repo soon.&lt;/p&gt;
&lt;p&gt;The basic approach is to have an &#34;endings&#34; database and a &#34;stems&#34; database. The &#34;endings&#34; database looks like:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;PAI.1S:
    - &amp;quot;|&amp;gt;ω&amp;lt;ω|&amp;quot;
    - &amp;quot;|ε&amp;gt;ῶ&amp;lt;ω|&amp;quot;
    - &amp;quot;|ο&amp;gt;ῶ&amp;lt;ω|&amp;quot;
    - &amp;quot;|α&amp;gt;ῶ&amp;lt;ω|&amp;quot;
    - &amp;quot;|ο!&amp;gt;ω&amp;lt;_1|μι&amp;quot;
    - &amp;quot;|ε!&amp;gt;η&amp;lt;_1|μι&amp;quot;
    - &amp;quot;|υ!&amp;gt;υ&amp;lt;_1|μι&amp;quot;
    - &amp;quot;|α!&amp;gt;η&amp;lt;_1|μι&amp;quot;
    - &amp;quot;|ει!&amp;gt;ει&amp;lt;_1|μι&amp;quot;

AAI.1S:
    - &amp;quot;|&amp;gt;&amp;lt;|α&amp;quot;
    - &amp;quot;|%&amp;gt;ο&amp;lt;T_1|ν&amp;quot;
    - &amp;quot;|α^&amp;gt;η&amp;lt;_1|ν&amp;quot;
    - &amp;quot;|ε^&amp;gt;η&amp;lt;_1|ν&amp;quot;
    - &amp;quot;|ο^&amp;gt;ω&amp;lt;_1|ν&amp;quot;
    - &amp;quot;|α!&amp;gt;η&amp;lt;_1|ν&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;where endings and sandhi are expressed. You can see various stem diacritics like &lt;code&gt;!&lt;/code&gt; for athematic, &lt;code&gt;^&lt;/code&gt; for root aorists and &lt;code&gt;%&lt;/code&gt; for second aorists. &lt;code&gt;T_1&lt;/code&gt; represents a thematic vowel and &lt;code&gt;_1&lt;/code&gt; a particular ablaut pattern.&lt;/p&gt;
&lt;p&gt;Alongside this is a larger stem database:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;ἀγαπάω:
    stems:
        1-: ἀγαπα
        1+: ἠγαπα
        2-: ἀγαπησ
        3-: ἀγαπησ
        3+: ἠγαπησ
        4-: ἠγαπηκ
        5-: ἠγαπη
        7-: ἀγαπηθησ
ἀναλαμβάνω:
    compound: ἀνά++λαμβάνω
    stems:
        1-: ἀναλαμβαν
        3-: ἀναλαβ%
        3+: ἀνελαβ%
        6-: ἀναλημφθ
        6+: ἀνελημφθ
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Stems are keyed by a principal-part-like scheme where &lt;code&gt;-&lt;/code&gt; / &lt;code&gt;+&lt;/code&gt; refers to unaugmented and augmented respectively. The &lt;code&gt;7-&lt;/code&gt; stem is the future passive.&lt;/p&gt;
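&lt;p&gt;As a small illustration of how such keys can be used, here is a sketch that mirrors the excerpt above as a plain Python dict and looks a stem up by principal-part number and augmentation. The names &lt;code&gt;STEMS&lt;/code&gt; and &lt;code&gt;stem_for&lt;/code&gt; are hypothetical and not taken from the repo, and the reading of &lt;code&gt;+&lt;/code&gt; as augmented is inferred from the data (ἠγαπα sits under &lt;code&gt;1+&lt;/code&gt;).&lt;/p&gt;

```python
# Plain-dict mirror of the stem database excerpt.
STEMS = {
    "ἀγαπάω": {
        "stems": {
            "1-": "ἀγαπα", "1+": "ἠγαπα", "2-": "ἀγαπησ", "3-": "ἀγαπησ",
            "3+": "ἠγαπησ", "4-": "ἠγαπηκ", "5-": "ἠγαπη", "7-": "ἀγαπηθησ",
        },
    },
    "ἀναλαμβάνω": {
        "compound": "ἀνά++λαμβάνω",
        "stems": {
            "1-": "ἀναλαμβαν", "3-": "ἀναλαβ%", "3+": "ἀνελαβ%",
            "6-": "ἀναλημφθ", "6+": "ἀνελημφθ",
        },
    },
}


def stem_for(lemma, part, augmented=False):
    """Look up a stem by principal-part number and augmentation.

    Returns None when the database has no entry for that key.
    """
    key = f"{part}{'+' if augmented else '-'}"
    return STEMS[lemma]["stems"].get(key)


print(stem_for("ἀγαπάω", 1, augmented=True))
print(stem_for("ἀναλαμβάνω", 2))
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; for a missing key keeps the gap visible, which is the case that matters when a stem is not yet in the database.&lt;/p&gt;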
&lt;p&gt;The stem database can also do overrides for individual paradigm cells, show preverbs, mark enclitics and more.&lt;/p&gt;
&lt;p&gt;All this gets tested against the Dik and Pratt examples and the verb forms in the MorphGNT in two ways:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;given a lemma and features, is the correct form generated?&lt;/li&gt;
&lt;li&gt;given a form, lemma and features, is the correct stem identified?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Once the imperatives, subjunctives and optatives are done, I&#39;ll work on stem relationships, essentially treating the stems as another paradigm. I may also at some point generate distinguishers for each verb form (within a particular aspect/tense-voice form).&lt;/p&gt;
&lt;p&gt;Further work will involve using it to actually analyze new texts, particularly handling the case where the stem is not yet in the stem database.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">In anticipation of my SBL talk tomorrow, here&#39;s an update on my verbal analysis.</summary>
  </entry><entry>
    <title type="html">Greek Accentuation Library</title>
    <link href="https://jktauber.com/2015/11/20/greek-accentuation-library/" rel="alternate" type="text/html" title="Greek Accentuation Library"/>
    <published>2015-11-20</published>
    <updated>2015-11-20</updated>
    <id>https://jktauber.com/2015/11/20/greek-accentuation-library</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/20/greek-accentuation-library/">&lt;p&gt;I knew that a necessary component of a comprehensive morphological analyzer for Ancient Greek was going to be a library for handling accentuation, so back in January 2014, I started the &lt;code&gt;greek-accentuation&lt;/code&gt; Python library.&lt;/p&gt;
&lt;p&gt;It consists of three modules:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;characters&lt;/li&gt;
&lt;li&gt;syllabify&lt;/li&gt;
&lt;li&gt;accentuation&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;strong&gt;characters&lt;/strong&gt; module provides basic analysis and manipulation of Greek characters in terms of their Unicode diacritics as if decomposed. So you can use it to add, remove or test for breathing, accents, iota subscript or length diacritics.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; base(&#39;ᾳ&#39;)
&#39;α&#39;

&amp;gt;&amp;gt;&amp;gt; iota_subscript(&#39;ᾳ&#39;) == IOTA_SUBSCRIPT
True

&amp;gt;&amp;gt;&amp;gt; add_diacritic(&#39;α&#39;, IOTA_SUBSCRIPT)
&#39;ᾳ&#39;
&lt;/code&gt;&lt;/pre&gt;
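&lt;p&gt;The &#34;as if decomposed&#34; idea can be illustrated with the standard library alone. This is a sketch of the underlying technique using Unicode normalization, not the library&#39;s implementation, and the helper names are deliberately different from the library&#39;s own.&lt;/p&gt;

```python
import unicodedata

IOTA_SUBSCRIPT_MARK = "\u0345"  # COMBINING GREEK YPOGEGRAMMENI


def base_letter(ch):
    """Return the letter with every diacritic removed."""
    return unicodedata.normalize("NFD", ch)[0]


def has_mark(ch, mark):
    """Test whether a combining mark is present on the character."""
    return mark in unicodedata.normalize("NFD", ch)


def with_mark(ch, mark):
    """Add a combining mark and recompose where Unicode allows it."""
    return unicodedata.normalize("NFC", ch + mark)


alpha_with_iota = with_mark("α", IOTA_SUBSCRIPT_MARK)
print(alpha_with_iota, base_letter(alpha_with_iota))
```

&lt;p&gt;Working on the decomposed form means each diacritic can be handled as a separate code point, whichever precomposed character the text happens to use.&lt;/p&gt;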

&lt;p&gt;The &lt;strong&gt;syllabify&lt;/strong&gt; module provides basic analysis and manipulation of Greek syllables. It can syllabify words, give you the onset, nucleus, coda, rime or body of a syllable, judge syllable length or give you the accentuation class of a word.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; syllabify(&#39;γυναικός&#39;)
[&#39;γυ&#39;, &#39;ναι&#39;, &#39;κός&#39;]

&amp;gt;&amp;gt;&amp;gt; penult(&#39;οἰκία&#39;)
&#39;κί&#39;

&amp;gt;&amp;gt;&amp;gt; paroxytone(&#39;λόγος&#39;)
True
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The &lt;strong&gt;accentuation&lt;/strong&gt; module uses the other two modules to accentuate Ancient Greek words. As well as listing &lt;code&gt;possible_accentuations&lt;/code&gt; for a given unaccented word, it can produce &lt;code&gt;recessive&lt;/code&gt; and (given another form with an accent) &lt;code&gt;persistent&lt;/code&gt; accentuations.&lt;/p&gt;
&lt;p&gt;The library is open source under an MIT license. You can get the package on PyPI and the source repo is &lt;a href=&#34;https://github.com/jtauber/greek-accentuation&#34;&gt;https://github.com/jtauber/greek-accentuation&lt;/a&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I knew that a necessary component of a comprehensive morphological analyzer for Ancient Greek was going to be a library for handling accentuation, so back in January 2014, I started the &lt;code&gt;greek-accentuation&lt;/code&gt; Python library.</summary>
  </entry><entry>
    <title type="html">The Dangers of Reconstructing Too Much Morphophonology</title>
    <link href="https://jktauber.com/2015/11/19/dangers-reconstructing-too-much-morphophonology/" rel="alternate" type="text/html" title="The Dangers of Reconstructing Too Much Morphophonology"/>
    <published>2015-11-19</published>
    <updated>2015-11-19</updated>
    <id>https://jktauber.com/2015/11/19/dangers-reconstructing-too-much-morphophonology</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/19/dangers-reconstructing-too-much-morphophonology/">&lt;p&gt;What is the genitive singular ending for 2nd declension nouns?&lt;/p&gt;
&lt;p&gt;The beginner student probably thinks the ending is ου.&lt;/p&gt;
&lt;p&gt;Those that are told the &lt;em&gt;stem&lt;/em&gt; ends in ο might be tempted to conclude the actual ending is υ. At least one popular introductory text teaches this but it&#39;s incorrect.&lt;/p&gt;
&lt;p&gt;Those more familiar with the sandhi rules might conclude the ου &lt;em&gt;could&lt;/em&gt; come from ο+σο or ε+σο via οο. Those who know some Homer might speculate an ο+ιο, but ου is also found in Homer (especially in the pronouns) which might seem confusing.&lt;/p&gt;
&lt;p&gt;Those who study Proto-Indo-European might know of &lt;em&gt;*osyo&lt;/em&gt; becoming &lt;em&gt;*ohyo&lt;/em&gt; in Proto-Greek then &lt;em&gt;*oyyo&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;How should this be modeled synchronically? I think there&#39;s too much of a tendency in morphophonology to adopt an &#34;ontogeny recapitulates phylogeny&#34; approach and assume that speakers are storing a &lt;em&gt;historical&lt;/em&gt; underlying form and then replaying millennia of sound changes.&lt;/p&gt;
&lt;p&gt;The problem here is there&#39;s no way a Koine speaker would have reconstructed &lt;em&gt;*osyo&lt;/em&gt; during acquisition. In my &lt;a href=&#34;/2015/11/12/analyzing-nominal-morphology-part-1/&#34;&gt;stem+ending annotations&lt;/a&gt; I tentatively used ο+ιο but I&#39;m reconsidering that. There is no evidence I can think of that would have helped a native Koine speaker choose between ο+ιο, ο+σο or ο+ο as underlying.&lt;/p&gt;
&lt;p&gt;And given that there is a class of 1st declension masculine nouns whose genitive singular ends in ου despite the α stem ending (which could not result in ου unless the α was actually dropped), it may actually be best to view the speakers&#39; knowledge as the ending just being &#34;ου&#34;, the naïve view we wrote off at the start.&lt;/p&gt;
&lt;p&gt;At the very least, we need to be very careful when saying &#34;the stem is X, the ending is Y&#34; as to whether we are trying to explain the form historically or the speakers&#39; synchronic knowledge.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">What is the genitive singular ending for 2nd declension nouns?</summary>
  </entry><entry>
    <title type="html">Full Citation Forms and Inflectional Classes</title>
    <link href="https://jktauber.com/2015/11/18/full-citation-forms-and-inflectional-classes/" rel="alternate" type="text/html" title="Full Citation Forms and Inflectional Classes"/>
    <published>2015-11-18</published>
    <updated>2015-11-18</updated>
    <id>https://jktauber.com/2015/11/18/full-citation-forms-and-inflectional-classes</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/18/full-citation-forms-and-inflectional-classes/">&lt;p&gt;Back in July and August 2014, I started looking at patterns in the full citation forms of nouns in Danker&#39;s Concise Lexicon. My goal was partly to explore, in a systematic way, the relationship between inflectional classes and the information expressed in the common pattern of &lt;code&gt;{nominative form}, {genitive ending}, {article}&lt;/code&gt;. I also wanted to put together a kind of automated test to catch typos and inconsistencies in the lexicon.&lt;/p&gt;
&lt;p&gt;I started drafting a paper with my findings as I went along and I intend to get back to it at some point but I wanted to mention this little project here, point to the code and mention a couple of things coming out of it so far.&lt;/p&gt;
&lt;p&gt;The code is available at &lt;a href=&#34;https://github.com/morphgnt/morphological-lexicon/tree/master/projects/citation_forms&#34;&gt;https://github.com/morphgnt/morphological-lexicon/tree/master/projects/citation_forms&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In particular, the file &lt;code&gt;citation_form_data.py&lt;/code&gt; contains the rules (still needing some work outside the basic &lt;code&gt;{nominative form}, {genitive ending}, {article}&lt;/code&gt; pattern) for what a full citation form can look like.&lt;/p&gt;
&lt;p&gt;Each row in this file contains a tuple of:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;a tuple of regexes matching the full citation form, Mounce&#39;s category and Dobson&#39;s part-of-speech/gender (the last mostly to catch errors in that file)&lt;/li&gt;
&lt;li&gt;a tentative new label for the inflectional class&lt;/li&gt;
&lt;li&gt;a (potentially empty) list of child rules&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For example:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;((r&amp;quot;α, ας, ἡ$&amp;quot;, r&amp;quot;^n-1a$&amp;quot;, r&amp;quot;^N:F$&amp;quot;), &amp;quot;1.1/a1/F&amp;quot;, []),
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;These rules are organized in a hierarchy, starting with the most general rules, which contain more specific subsets as children. The inflectional class labels like &lt;code&gt;1.1/a1/F&lt;/code&gt; are intended to reflect this hierarchy. For example, here are the ancestors of the above rule:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;((r&amp;quot;^(\w+), (\w+), (\w+)$&amp;quot;, r&amp;quot;^n-&amp;quot;, r&amp;quot;^N&amp;quot;), &amp;quot;&amp;quot;, [
    ((r&amp;quot;[αη]ς, {art}$&amp;quot;, r&amp;quot;^n-1&amp;quot;, r&amp;quot;^N:.$&amp;quot;), &amp;quot;1&amp;quot;, [
        ((r&amp;quot;ας, ἡ$&amp;quot;, r&amp;quot;^n-1&amp;quot;, r&amp;quot;^N:F$&amp;quot;), &amp;quot;1.1/F&amp;quot;, [
            ((r&amp;quot; ας, ἡ$&amp;quot;, r&amp;quot;^n-1&amp;quot;, r&amp;quot;^N:F$&amp;quot;), &amp;quot;1.1/F&amp;quot;, [
                ((r&amp;quot;α, ας, ἡ$&amp;quot;, r&amp;quot;^n-1[ah]$&amp;quot;, r&amp;quot;^N:F$&amp;quot;), &amp;quot;1.1/a/F&amp;quot;, [
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The first line is the most general rule for any nouns whose citation form in Danker has three parts. The next level (given the class &lt;code&gt;1&lt;/code&gt;) covers those that have a citation form ending with either ας or ης and then an article. This is further subset (class &lt;code&gt;1.1/F&lt;/code&gt;) into citation forms ending with ας and a feminine singular article. This is further subset into citation forms with no other letters before ας in the genitive ending provided. This is further subset (class &lt;code&gt;1.1/a/F&lt;/code&gt;) into those citation forms whose nominative form ends with α. Because this still results in a Mounce category of n-1a or n-1h, this is further refined into the first line we saw with the inflectional class &lt;code&gt;1.1/a1/F&lt;/code&gt;.&lt;/p&gt;
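&lt;p&gt;One way such a hierarchy could be walked is sketched below, using a cut-down set of rules in the same tuple shape. This is only an illustration under my own assumptions, not the code in &lt;code&gt;citation_form_data.py&lt;/code&gt;: the &lt;code&gt;classify&lt;/code&gt; function is hypothetical, the intermediate levels are omitted, and the three sample entries are chosen for the example.&lt;/p&gt;

```python
import re
import unicodedata

# Each rule is ((citation regex, Mounce regex, Dobson regex), label, children).
RULES = [
    ((r"^(\w+), (\w+), (\w+)$", r"^n-", r"^N"), "", [
        ((r"ας, ἡ$", r"^n-1", r"^N:F$"), "1.1/F", [
            ((r"α, ας, ἡ$", r"^n-1a$", r"^N:F$"), "1.1/a1/F", []),
        ]),
    ]),
]


def nfc(text):
    """Normalize so precomposed and decomposed Greek compare equal."""
    return unicodedata.normalize("NFC", text)


def classify(fields, rules=RULES):
    """Return the label of the most specific matching rule, or None."""
    for patterns, label, children in rules:
        if all(re.search(nfc(p), nfc(f)) for p, f in zip(patterns, fields)):
            deeper = classify(fields, children)
            return label if deeper is None else deeper
    return None


print(classify(("ἡμέρα, ας, ἡ", "n-1a", "N:F")))
print(classify(("γραφή, ῆς, ἡ", "n-1b", "N:F")))
```

&lt;p&gt;An entry that matches a parent rule but none of its children falls back to the parent&#39;s label, which is how gaps in the more specific rules would show up.&lt;/p&gt;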
&lt;p&gt;From these rules certain inconsistencies show up. For example, &#34;γῆ, γῆς, ἡ&#34; is the only &#34;η, ης, ἡ&#34; entry that gives the full genitive form rather than just the genitive ending. Five of the six masculine words with genitive in &#34;τος&#34; give &#34;τος&#34; with the preceding vowel as the genitive ending but the other one gives the full genitive form. 34 feminine words with genitive in &#34;τος&#34; give just the preceding vowel but one gives the preceding consonant + vowel.&lt;/p&gt;
&lt;p&gt;For a lexicon whose editors want consistency in their citation forms, this kind of thing is useful to be able to check programmatically.&lt;/p&gt;
&lt;p&gt;Lots more to say when I get around to finishing the paper but I wanted to at least share the code and (in-progress) rules. For the tie-in to inflectional class modeling, I&#39;ll soon integrate this work with my recent work on &lt;a href=&#34;/2015/11/12/analyzing-nominal-morphology-part-1/&#34;&gt;Analyzing Nominal Morphology&lt;/a&gt; but I&#39;ll also use the &#34;automatic consistency checking&#34; aspect of the work to ensure better consistency in the &lt;em&gt;Morphological Lexicon&lt;/em&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Back in July and August 2014, I started looking at patterns in the full citation forms of nouns in Danker&#39;s Concise Lexicon. My goal was partly to explore, in a systematic way, the relationship between inflectional classes and the information expressed in the common pattern of &lt;code&gt;{nominative form}, {genitive ending}, {article}&lt;/code&gt;. I also wanted to put together a kind of automated test to catch typos and inconsistencies in the lexicon.</summary>
  </entry><entry>
    <title type="html">Modern Greek Text to Speech for Biblical Greek</title>
    <link href="https://jktauber.com/2015/11/17/modern-greek-text-speech-biblical-greek/" rel="alternate" type="text/html" title="Modern Greek Text to Speech for Biblical Greek"/>
    <published>2015-11-17</published>
    <updated>2015-11-17</updated>
    <id>https://jktauber.com/2015/11/17/modern-greek-text-speech-biblical-greek</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/17/modern-greek-text-speech-biblical-greek/">&lt;p&gt;Text-to-speech is pretty good these days but a lot of people don&#39;t realize that operating systems like OS X have support for languages other than English, including Modern Greek. So I thought I&#39;d experiment with using it to read the Greek New Testament.&lt;/p&gt;
&lt;p&gt;On OS X, if you go to System Preferences &amp;gt; Dictation and Speech, then select &#34;Customize...&#34; under System Voice, you can download or upgrade your Greek voices. There is a male and a female voice you can try: Nikos and Melina respectively.&lt;/p&gt;
&lt;p&gt;There are two ways I know of that you can then get those voices to read Greek for you.&lt;/p&gt;
&lt;p&gt;The first way is, with Nikos or Melina selected as the System Voice, you select any Greek text in another app (such as TextEdit), right click and select Speech &amp;gt; Start Speaking. This will honour the speed setting in System Preferences &amp;gt; Dictation and Speech. Slowing down the speech drops quality dramatically, though.&lt;/p&gt;
&lt;p&gt;The second way is on the command line with &lt;code&gt;say&lt;/code&gt;. I can&#39;t work out if &lt;code&gt;say&lt;/code&gt; supports slowing down the reading (it doesn&#39;t honour the speed setting in System Preferences) but it does support outputting the result to an AIFF file.&lt;/p&gt;
&lt;p&gt;Note that you can&#39;t feed it polytonic Greek so you need to strip breathing and convert accents. I did that to produce a text like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Ήν δέ άνθρωπος εκ τών Φαρισαίων, Νικόδημος όνομα αυτώ, άρχων τών Ιουδαίων· ούτος ήλθεν πρός αυτόν νυκτός καί είπεν αυτώ· Ραββί, οίδαμεν ότι από θεού ελήλυθας διδάσκαλος· ουδείς γάρ δύναται ταύτα τά σημεία ποιείν ά σύ ποιείς, εάν μή ή ο θεός μετ’ αυτού. απεκρίθη Ιησούς καί είπεν αυτώ· Αμήν αμήν λέγω σοι, εάν μή τις γεννηθή άνωθεν, ου δύναται ιδείν τήν βασιλείαν τού θεού.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I then used&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;say -v Nikos -f john_3_1.txt -o john_3_1
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;to produce the following &lt;a href=&#34;/images/john_3_1.aiff&#34;&gt;AIFF file&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;A pretty decent reading of the Greek New Testament with Modern Greek pronunciation.&lt;/p&gt;
&lt;p&gt;The only oddity is that the ου in the last clause is spelled out. Not sure how to fix that.&lt;/p&gt;
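&lt;p&gt;The stripping of breathings and conversion of accents described above could be sketched in Python roughly as follows. This is a minimal sketch and not necessarily the script used for the text above: the function name &lt;code&gt;to_monotonic&lt;/code&gt; is made up for illustration, and it assumes that dropping iota subscripts and turning every grave or circumflex into an acute is acceptable input for the voice.&lt;/p&gt;

```python
import unicodedata

# Combining marks to remove entirely (assumption: the voice does not need them).
DROP = {"\u0313", "\u0314", "\u0345"}  # smooth breathing, rough breathing, iota subscript
# Combining accents to rewrite as a plain acute.
TO_ACUTE = {"\u0300", "\u0342"}  # grave, circumflex (perispomeni)


def to_monotonic(text):
    """Approximate a monotonic spelling of polytonic Greek text."""
    out = []
    # NFD separates each base letter from its combining marks.
    for ch in unicodedata.normalize("NFD", text):
        if ch in DROP:
            continue
        out.append("\u0301" if ch in TO_ACUTE else ch)
    # NFC recomposes letter + acute into the precomposed accented letters.
    return unicodedata.normalize("NFC", "".join(out))


print(to_monotonic("ἐν ἀρχῇ ἦν ὁ λόγος"))
```

&lt;p&gt;Note that this keeps accents on monosyllables, as in the sample text above, rather than following modern spelling rules.&lt;/p&gt;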
&lt;p&gt;What excites me about this is less the generation of long audio files of entire passages, but more how it could be used in conjunction with an intelligent tutor to pronounce individual words and phrases that the student is currently studying.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Text-to-speech is pretty good these days but a lot of people don&#39;t realize that operating systems like OS X have support for languages other than English, including Modern Greek. So I thought I&#39;d experiment with using it to read the Greek New Testament.</summary>
  </entry><entry>
    <title type="html">Actual Core Vocab Lists for Greek New Testament</title>
    <link href="https://jktauber.com/2015/11/16/actual-core-vocab-lists-greek-new-testament/" rel="alternate" type="text/html" title="Actual Core Vocab Lists for Greek New Testament"/>
    <published>2015-11-16</published>
    <updated>2015-11-16</updated>
    <id>https://jktauber.com/2015/11/16/actual-core-vocab-lists-greek-new-testament</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/16/actual-core-vocab-lists-greek-new-testament/">&lt;p&gt;Back in &lt;a href=&#34;/2015/10/30/core-vocabulary-new-testament-greek/&#34;&gt;The Core Vocabulary of New Testament Greek&lt;/a&gt; I talked about Wilfred Major&#39;s 2008 paper on core vocabulary lists for Classical Greek and provided code for producing the same for the Greek New Testament along with some discussion of the results. I didn&#39;t actually include the full results, however.&lt;/p&gt;
&lt;p&gt;Prompted by Paul-Nitz&#39;s &lt;a href=&#34;http://www.ibiblio.org/bgreek/forum/viewtopic.php?f=15&amp;amp;t=3418&amp;amp;p=22864#p22821&#34;&gt;request&lt;/a&gt; on the B-Greek forum, I put together &lt;a href=&#34;https://github.com/jtauber/core-gnt-vocab&#34;&gt;https://github.com/jtauber/core-gnt-vocab&lt;/a&gt; which includes not only the code but actually generated lists (currently 50% and 80% lemma lists).&lt;/p&gt;
&lt;p&gt;I&#39;ve included as a starting point glosses from Dodson but I&#39;d love people to file issues (or even better, pull requests) if they have improvements they&#39;d like to see.&lt;/p&gt;
&lt;p&gt;I&#39;m also interested if people think certain lexemes should be split like Major does (e.g. suppletive verbs).&lt;/p&gt;
&lt;p&gt;You can get the raw lists at:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://raw.githubusercontent.com/jtauber/core-gnt-vocab/master/lemma_50.txt&#34;&gt;50% List&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://raw.githubusercontent.com/jtauber/core-gnt-vocab/master/lemma_80.txt&#34;&gt;80% List&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Back in &lt;a href=&#34;/2015/10/30/core-vocabulary-new-testament-greek/&#34;&gt;The Core Vocabulary of New Testament Greek&lt;/a&gt; I talked about Wilfred Major&#39;s 2008 paper on core vocabulary lists for Classical Greek and provided code for producing the same for the Greek New Testament along with some discussion of the results. I didn&#39;t actually include the full results, however.</summary>
  </entry><entry>
    <title type="html">First Prototype of New Online Reader</title>
    <link href="https://jktauber.com/2015/11/15/first-prototype-new-online-reader/" rel="alternate" type="text/html" title="First Prototype of New Online Reader"/>
    <published>2015-11-15</published>
    <updated>2015-11-15</updated>
    <id>https://jktauber.com/2015/11/15/first-prototype-new-online-reader</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/15/first-prototype-new-online-reader/">&lt;p&gt;Over in the lab section of this site, I&#39;ve added a little prototype Patrick Altman and I built last night.&lt;/p&gt;
&lt;p&gt;At the moment it just shows the first paragraph of John 3 but if you click on a word it gives the lemmatization and parsing from MorphGNT, the gloss from Dodson and links to the head and child dependencies based on the GBI Syntax trees.&lt;/p&gt;
&lt;p&gt;You can try it out at &lt;a href=&#34;https://jktauber.com/labs/reader.html&#34;&gt;https://jktauber.com/labs/reader.html&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The source code is available at &lt;a href=&#34;https://github.com/morphgnt/reader-demo&#34;&gt;https://github.com/morphgnt/reader-demo&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Besides the obvious extension to the rest of the GNT text, I&#39;ll soon bring in information from the Morphological Lexicon to help readers understand &lt;em&gt;why&lt;/em&gt; the form is what it is.&lt;/p&gt;
&lt;p&gt;Longer term, I&#39;d like to add user accounts so authenticated users can bookmark passages, words and forms. Giving users the ability to mark which words they do or don&#39;t understand means that the site can then produce custom quizzes, recommend what to read next, etc.&lt;/p&gt;
&lt;p&gt;This is starting to get to the real heart of learning tools driven by better linguistic databases.&lt;/p&gt;
&lt;p&gt;If you&#39;re a Django and/or React developer who would like to help with this, let me know. If you teach intermediate students and have feedback on what would make this more useful, I&#39;d also love to hear from you.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Over in the lab section of this site, I&#39;ve added a little prototype Patrick Altman and I built last night.</summary>
  </entry><entry>
    <title type="html">Analyzing Nominal Morphology: Part 2</title>
    <link href="https://jktauber.com/2015/11/14/analyzing-nominal-morphology-part-2/" rel="alternate" type="text/html" title="Analyzing Nominal Morphology: Part 2"/>
    <published>2015-11-14</published>
    <updated>2015-11-14</updated>
    <id>https://jktauber.com/2015/11/14/analyzing-nominal-morphology-part-2</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/14/analyzing-nominal-morphology-part-2/">&lt;p&gt;In &lt;a href=&#34;/2015/11/12/analyzing-nominal-morphology-part-1/&#34;&gt;Analyzing Nominal Morphology: Part 1&lt;/a&gt;, I talked about putting together a list of nominal distinguishers and verifying it on the MorphGNT, generating a per-lexeme theme + distinguisher analysis. Here, I&#39;ll outline some further steps I&#39;ve taken.&lt;/p&gt;
&lt;p&gt;As well as producing a YAML file with entries for each lexeme, I also now generate a (space-delimited) tabular form that looks like this:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;ἀβαρής a-4a -- M n-3d(2aA) ἀβαρ AS ἀβαρῆ ἀβαρ ῆ εσ+α
ἄβυσσος n-2b -- F n-2b ἀβυσσ GS ἀβύσσου ἀβύσσ ου ο+ιο
ἄβυσσος n-2b -- F n-2b ἀβυσσ AS ἄβυσσον ἄβυσσ ον ο+ν
ἀγαθοποιέω verb PA M n=3c(5b-OU) ἀγαθοποι NS ἀγαθοποιῶν ἀγαθοποι ῶν ουντ+
ἀγαθοποιέω verb PA M n=3c(5b-OU) ἀγαθοποι NP ἀγαθοποιοῦντες ἀγαθοποι οῦντες ουντ+ες
ἀγαθοποιέω verb PA M n=3c(5b-OU) ἀγαθοποι AP ἀγαθοποιοῦντας ἀγαθοποι οῦντας ουντ+ας
ἀγαθοποιέω verb PA F n-1c ἀγαθοποιουσ NP ἀγαθοποιοῦσαι ἀγαθοποιοῦσ αι α+ι
ἀγαθοποιΐα n-1a -- F n-1a ἀγαθοποιϊ DS ἀγαθοποιΐᾳ ἀγαθοποιΐ ᾳ α+ι
ἀγαθοποιός a-3a -- M n-2a ἀγαθοποι GP ἀγαθοποιῶν ἀγαθοποι ῶν +ων
ἀγαθός a-1a(2a) -- M n-2a ἀγαθ NS ἀγαθός ἀγαθ ός ο+ς
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The columns are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;lemma&lt;/li&gt;
&lt;li&gt;Mounce category (or &lt;code&gt;verb&lt;/code&gt; for participles) for overall lexeme&lt;/li&gt;
&lt;li&gt;aspect / voice (for participles)&lt;/li&gt;
&lt;li&gt;gender&lt;/li&gt;
&lt;li&gt;Mounce category used for particular sub-paradigm (different from overall lexeme for adjectives or participles)&lt;/li&gt;
&lt;li&gt;lexeme-level theme&lt;/li&gt;
&lt;li&gt;case / number&lt;/li&gt;
&lt;li&gt;form&lt;/li&gt;
&lt;li&gt;form-specific theme&lt;/li&gt;
&lt;li&gt;form-specific distinguisher&lt;/li&gt;
&lt;li&gt;stem ending and suffix&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What&#39;s helpful about this format is you can use &lt;code&gt;awk&lt;/code&gt;, &lt;code&gt;grep&lt;/code&gt;, &lt;code&gt;sort&lt;/code&gt;, &lt;code&gt;wc&lt;/code&gt; and other Unix tools to very quickly extract information. (I may soon put it in SQL and expose a web interface too). So you can see all the times a particular distinguisher is used, or all the times it&#39;s used for a particular case / number. Or what all the sandhi rules are.&lt;/p&gt;
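&lt;p&gt;The same kind of query can also be sketched in Python. The example below counts which distinguishers occur for a given case / number, using four rows copied from the sample above. The function name &lt;code&gt;count_distinguishers&lt;/code&gt; and the column-index constants are assumptions made for illustration, not part of the actual scripts.&lt;/p&gt;

```python
from collections import Counter

# Four rows copied from the sample output above (11 space-delimited columns).
SAMPLE = """\
ἄβυσσος n-2b -- F n-2b ἀβυσσ GS ἀβύσσου ἀβύσσ ου ο+ιο
ἄβυσσος n-2b -- F n-2b ἀβυσσ AS ἄβυσσον ἄβυσσ ον ο+ν
ἀγαθοποιός a-3a -- M n-2a ἀγαθοποι GP ἀγαθοποιῶν ἀγαθοποι ῶν +ων
ἀγαθός a-1a(2a) -- M n-2a ἀγαθ NS ἀγαθός ἀγαθ ός ο+ς
"""

CASE_NUMBER = 6    # zero-based index of the case / number column
DISTINGUISHER = 9  # zero-based index of the form-specific distinguisher column


def count_distinguishers(text, case_number):
    """Count the distinguishers used for one case / number value."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 11:
            continue  # skip blank or malformed rows
        if fields[CASE_NUMBER] == case_number:
            counts[fields[DISTINGUISHER]] += 1
    return counts


print(count_distinguishers(SAMPLE, "GS"))
```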
&lt;p&gt;I&#39;ve already written a Python script that generates a list of paradigms based on this (keyed off Mounce category for now, until I&#39;ve finalized my own, which will actually be defined &lt;em&gt;by&lt;/em&gt; these paradigms).&lt;/p&gt;
&lt;p&gt;The paradigms look like:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;n-3b(1) M (10):
    NS:   ξ          {κ+ς}
    GS:   κος        {κ+ος}
    DS:   κι         {κ+ι}
    AS:   κα         {κ+α}
    NP:   κες        {κ+ες}
    GP:   κων        {κ+ων}
    AP:   κας        {κ+ας}
&lt;/code&gt;&lt;/pre&gt;
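&lt;p&gt;As a rough illustration of how paradigms like the one above could be collected from the tabular format, here is a minimal sketch. It is not the actual script: &lt;code&gt;build_paradigms&lt;/code&gt; is a hypothetical name, it groups by sub-paradigm category and gender only, and it leaves accents on the distinguishers rather than normalizing them as the real output appears to do.&lt;/p&gt;

```python
from collections import defaultdict

# Three rows copied from the tabular sample earlier in the post.
ROWS = """\
ἄβυσσος n-2b -- F n-2b ἀβυσσ GS ἀβύσσου ἀβύσσ ου ο+ιο
ἄβυσσος n-2b -- F n-2b ἀβυσσ AS ἄβυσσον ἄβυσσ ον ο+ν
ἀγαθός a-1a(2a) -- M n-2a ἀγαθ NS ἀγαθός ἀγαθ ός ο+ς
"""


def build_paradigms(text):
    """Map (sub-paradigm category, gender) to {case/number: set of cells}."""
    paradigms = defaultdict(lambda: defaultdict(set))
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 11:
            continue  # skip blank or malformed rows
        gender, category, case_number = fields[3], fields[4], fields[6]
        distinguisher, analysis = fields[9], fields[10]
        paradigms[(category, gender)][case_number].add((distinguisher, analysis))
    return paradigms


for (category, gender), cells in build_paradigms(ROWS).items():
    print(category, gender)
    for case_number, entries in cells.items():
        for distinguisher, analysis in sorted(entries):
            print("    " + case_number + ":", distinguisher, "{" + analysis + "}")
```

&lt;p&gt;In a sketch like this, a cell that ends up with more than one entry would be a natural place to start looking for inconsistencies.&lt;/p&gt;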

&lt;p&gt;There&#39;s actually a feedback loop where inconsistencies and errors spotted in this paradigm output inform corrections to the underlying distinguisher rules.&lt;/p&gt;
&lt;p&gt;The code and data are available at &lt;a href=&#34;https://github.com/morphgnt/morphological-lexicon/tree/master/projects/nominal_distinguishers&#34;&gt;https://github.com/morphgnt/morphological-lexicon/tree/master/projects/nominal_distinguishers&lt;/a&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">In &lt;a href=&#34;/2015/11/12/analyzing-nominal-morphology-part-1/&#34;&gt;Analyzing Nominal Morphology: Part 1&lt;/a&gt;, I talked about putting together a list of nominal distinguishers and verifying it on the MorphGNT, generating a per-lexeme theme + distinguisher analysis. Here, I&#39;ll outline some further steps I&#39;ve taken.</summary>
  </entry><entry>
    <title type="html">Initial Thoughts on the Cost of Learning a Form</title>
    <link href="https://jktauber.com/2015/11/13/initial-thoughts-cost-learning-form/" rel="alternate" type="text/html" title="Initial Thoughts on the Cost of Learning a Form"/>
    <published>2015-11-13</published>
    <updated>2015-11-13</updated>
    <id>https://jktauber.com/2015/11/13/initial-thoughts-cost-learning-form</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/13/initial-thoughts-cost-learning-form/">&lt;p&gt;Over the years, when generating vocab coverage stats or orderings for graded readers, I&#39;ve used either lemmas or inflected forms as the items being learnt.&lt;/p&gt;
&lt;p&gt;The problem with using inflected forms is that it assumes knowing one form of a lexeme has nothing to do with knowing any other form of that lexeme. The problem with using lemmas is that it assumes knowing one form of a lexeme is enough to know all of them.&lt;/p&gt;
&lt;p&gt;Of course, the path forward lies somewhere in between and one of the motivations for all my &lt;em&gt;Morphological Lexicon&lt;/em&gt; work is to have the necessary data in machine-actionable form to take a much more intelligent approach to the relationship between knowing one form and knowing another.&lt;/p&gt;
&lt;p&gt;This gets into some very deep areas of psycholinguistics and learnability but, for now, I&#39;m mostly just looking for a better measure of the &#34;cost&#34; or &#34;effort&#34; of learning a new form for the purposes of judging readability, etc. than just assuming all forms are equal or that learning a lemma gives you all the forms.&lt;/p&gt;
&lt;p&gt;An initial improvement could be made by using &lt;a href=&#34;/2015/11/03/distinguishers-morphology/&#34;&gt;themes and distinguishers&lt;/a&gt;. Consider λόγου, whose theme is λογ and distinguisher is ου. The theme identifies the lexeme (by definition it&#39;s the part of the word shared by all cells in a paradigm for a particular lexeme). The distinguisher both identifies some morphosyntactic properties (the fact it&#39;s a genitive singular, assuming we can tell it&#39;s a nominal) and gives some hints as to inflectional class (i.e. it reduces the possible distinguishers other cells in the paradigm can take).&lt;/p&gt;
&lt;p&gt;So a simple way of modeling things is to say that, in order to understand λόγου, you need to know λογ and ου. Breaking apart the themes and distinguishers is an improvement over just looking at lexemes or forms. Using the theme takes care of suppletive stems too. (Although it does raise the question: does learning that two suppletive stems are the same lexeme cost effort or save it?)&lt;/p&gt;
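&lt;p&gt;The simple model just described can be written down in a few lines. This is only a sketch under strong assumptions: every unknown theme and every unknown distinguisher is given the same unit cost, and the name &lt;code&gt;form_cost&lt;/code&gt; and the sets of known items are invented for the example.&lt;/p&gt;

```python
def form_cost(theme, distinguisher, known_themes, known_distinguishers):
    """Count how many of the two parts of a form are still unknown."""
    cost = 0
    if theme not in known_themes:
        cost += 1  # assumed unit cost for learning a new theme
    if distinguisher not in known_distinguishers:
        cost += 1  # assumed unit cost for learning a new distinguisher
    return cost


# A learner who already knows the theme of λόγος and two of its distinguishers.
known_themes = {"λογ"}
known_distinguishers = {"ος", "ον"}

# λόγου = λογ + ου: the theme is known, the distinguisher is new.
print(form_cost("λογ", "ου", known_themes, known_distinguishers))
```

&lt;p&gt;Discounts of the kind discussed in the rest of this post would replace the flat unit costs with something that depends on what else the learner already knows.&lt;/p&gt;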
&lt;p&gt;There are a few situations that need more consideration though. Firstly, there are stems that aren&#39;t truly suppletive but are systematically derived from one another (e.g. λαμβαν / λαβ). To a first approximation, you could just model this as full suppletion in terms of effort but a more refined approach would be to give a &#34;discount&#34; on the effort of learning λαμβαν if you already know λαβ or vice-versa. Even then, you&#39;d likely only want to provide that discount once learning the nu-infix pattern had been costed.&lt;/p&gt;
&lt;p&gt;Secondly, consider families of distinguishers for the same properties that differ because of sandhi (either in that particular cell or in others, causing the theme to have less of the stem). For example, here are the 28 distinguishers for dative singular nominals according to my current analysis: -ᾳ, -αντι, -ατι, -γι, -δι, -ει, -ειρι, -ενι, -εντι, -ῃ, -ι, -ιδι, -κι, -κτι, -νι, -ντι, -οϊ, -ονι, -οντι, -οτι, -ουντι, -πι, -ρι, -τι, -τῳ, -υϊ, -ῳ, -ωντι. The reason 28 are needed is sandhi in other cells such as the nominative singular. The only ending is -ι so you really only need to know that one thing (plus perhaps that iota is subscripted after a long alpha, eta or omega). The distinguisher analysis is still useful (particularly for its role in hinting at inflectional class) but the cost should be massively discounted once you recognize the -ι pattern.&lt;/p&gt;
&lt;p&gt;Thirdly, I haven&#39;t yet talked about costs and discounts for the actual sandhi rules. Should the -ους ending in the genitive singular (for stems ending in εσ or οσ) be discounted if you know both the genitive singular ending -ος and the εσ+ος → ους / οσ+ος → ους sandhi rules?&lt;/p&gt;
&lt;p&gt;And finally, while I&#39;ve talked a couple of times here about the distinguisher hinting at the inflectional class, that information hasn&#39;t been incorporated into any costing or discounting in our discussions yet. It&#39;s worthy of a little more research into the psycholinguistics literature, but presumably seeing something like πίνακος primes you for recognizing πίναξ. It&#39;s also potentially useful for disambiguation: if you know the nominative plural ends in -ες, for example, then you know that -ος is a genitive singular not a nominative singular.&lt;/p&gt;
&lt;p&gt;There&#39;s clearly lots more to explore but it reinforces what I keep saying: having data like the distinguisher analysis opens us up to explore this sort of thing and potentially incorporate it in new learning tools.&lt;/p&gt;
&lt;p&gt;In this post, I&#39;ve just talked about morphology, but things can of course be extended (and &lt;em&gt;need&lt;/em&gt; to be extended) to constructions beyond the word. That, of course, requires richer analysis beyond what I&#39;m doing with the &lt;em&gt;Morphological Lexicon&lt;/em&gt; but that is something I eventually want to tackle as well.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Over the years, when generating vocab coverage stats or orderings for graded readers, I&#39;ve used either lemmas or inflected forms as the items being learnt.</summary>
  </entry><entry>
    <title type="html">Analyzing Nominal Morphology: Part 1</title>
    <link href="https://jktauber.com/2015/11/12/analyzing-nominal-morphology-part-1/" rel="alternate" type="text/html" title="Analyzing Nominal Morphology: Part 1"/>
    <published>2015-11-12</published>
    <updated>2015-11-12</updated>
    <id>https://jktauber.com/2015/11/12/analyzing-nominal-morphology-part-1</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/12/analyzing-nominal-morphology-part-1/">&lt;p&gt;While much of my work going back 10 years or more was on the nominals, the last few years I&#39;ve been focused on verbal morphology. I decided that for my SBL paper, however, I&#39;d revisit some of my noun work and ended up exploring some ideas afresh.&lt;/p&gt;
&lt;p&gt;By &lt;strong&gt;nominals&lt;/strong&gt; I mean nouns, adjectives, determiners, pronouns, proforms, participles. Basically anything marked for case (see &lt;a href=&#34;/2015/11/05/morphological-parts-speech-greek/&#34;&gt;Morphological Parts of Speech in Greek&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;I wanted to, at the very least, generate &lt;a href=&#34;/2015/11/03/distinguishers-morphology/&#34;&gt;themes and distinguishers&lt;/a&gt; for the nominals. But once you have that, you have a nice set up to explore stems, endings and sandhi. This is a nice interface into some of the general (i.e. not language-specific) morphology I was doing for my PhD. Finally, it enables me to get back to my long-running goal of laying out a system of inflectional classes that improves on Funk, Mounce and others.&lt;/p&gt;
&lt;p&gt;You can see the work in progress at &lt;a href=&#34;https://github.com/morphgnt/morphological-lexicon/tree/master/projects/nominal_distinguishers&#34;&gt;https://github.com/morphgnt/morphological-lexicon/tree/master/projects/nominal_distinguishers&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The first phase involved enumerating the possible distinguishers for each combination of case/number/gender. This was done incrementally, running a Python script that (a) showed me forms that weren&#39;t covered by the existing list; (b) showed me lexemes that had more than one theme. In some cases, multiple themes was a legitimate suppletion but in other cases it meant I hadn&#39;t gotten the theme/distinguisher split right. Because I had them in electronic form, I also used Mounce&#39;s inflectional classes as a hint to disambiguate distinguishers.&lt;/p&gt;
&lt;p&gt;So the first phase involved creating a file that looked something like this (just a very small subset of what is currently an 851-line file):&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;NSM:
    - ας n-1d α+ς
    - ης n-1f η+ς
    - ος n-2a ο+ς
    - ψ n-3a\(1\) π+ς
    - ψ n-3a\(2\) β+ς
    - ξ n-3b\(1\) κ+ς
    - ξ n-3b\(2\) γ+ς
    - ξ n-3b\(3\) χ+ς
    - ους n=3c\(2-OD\) οδ+ς
    - ς n-3c\(1\) τ+ς
    - ς n-3c\(2\) δ+ς
    - ς n-3c\(3\) θ+ς
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You&#39;ll notice I annotated each distinguisher with the underlying stem ending and inflectional ending. You can see I needed to use Mounce&#39;s codes (for now) to disambiguate distinguishers like ψ, ξ and ς. You&#39;ll also notice I had to invent my own temporary extensions to Mounce in the case of οδ+ς → ους because there are deliberately no sandhi rules built in to my scripts (more on that later).&lt;/p&gt;
&lt;p&gt;My initial script takes the above file, runs across all forms in the MorphGNT SBLGNT and produces entries like the following:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;ἀγαλλίασις:
    forms:
        F:
            theme(s): ἀγαλλιασ
            NS: ἀγαλλίασις ἀγαλλίασ|ις ϳ+ς
            GS: ἀγαλλιάσεως ἀγαλλιάσ|εως ϳ+ος
            DS: ἀγαλλιάσει ἀγαλλιάσ|ει ϳ+ι
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In some (not necessarily immediately) following posts, I&#39;ll talk more about additional outputs and other scripts in the pipeline.&lt;/p&gt;
&lt;p&gt;This mini-project is a great example of where having a deterministic verification process on manually tweaked rules works well (over, say, trying to automate the generation of the rules entirely).&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">While much of my work going back 10 years or more was on the nominals, the last few years I&#39;ve been focused on verbal morphology. I decided that for my SBL paper, however, I&#39;d revisit some of my noun work and ended up exploring some ideas afresh.</summary>
  </entry><entry>
    <title type="html">Technical Aspects of Openness</title>
    <link href="https://jktauber.com/2015/11/11/technical-aspects-openness/" rel="alternate" type="text/html" title="Technical Aspects of Openness"/>
    <published>2015-11-11</published>
    <updated>2015-11-11</updated>
    <id>https://jktauber.com/2015/11/11/technical-aspects-openness</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/11/technical-aspects-openness/">&lt;p&gt;In my &lt;a href=&#34;/2015/11/10/why-i-use-cc-sa-licenses/&#34;&gt;previous post&lt;/a&gt;, I talked about the legal / licensing aspects of open linguistic data but there are technical aspects in order for linguistic data to be open too.&lt;/p&gt;
&lt;p&gt;To illustrate, consider an out-of-copyright, printed lexicon. From a &lt;em&gt;licensing&lt;/em&gt; point of view, it&#39;s open—it can be redistributed with or without modifications, etc. But that doesn&#39;t make it particularly usable for computational work.&lt;/p&gt;
&lt;p&gt;A while ago I came across something Greg Crane had written where he talked about things being &lt;strong&gt;machine-actionable&lt;/strong&gt;. I like this a lot more than &#34;machine-readable&#34; because it isn&#39;t just about being able to &#34;read&#34; the work, but about being able to actually do interesting things with it.&lt;/p&gt;
&lt;p&gt;There are various facets of this so I thought I&#39;d try to enumerate some of them.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;correctable&lt;/strong&gt; — can I make corrections if I find mistakes?&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;verifiable&lt;/strong&gt; — can I write code to check for errors?&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;reproducible&lt;/strong&gt; — can I reproduce the results others have found?&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;extensible&lt;/strong&gt; — can I extend it with my own data or data from other sources?&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;queryable&lt;/strong&gt; — can I search, filter, or sort the data to get subsets of interest?&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;reusable&lt;/strong&gt; — can I use the same data for multiple applications?&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;repurposable&lt;/strong&gt; — can I use the data for purposes not conceived of initially?&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;adaptable&lt;/strong&gt; — can I produce different variants of the data applicable to different users?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;My &lt;a href=&#34;/2015/05/06/my-bibletech-2015-talk/&#34;&gt;BibleTech 2015&lt;/a&gt; talk touched on a number of these.&lt;/p&gt;
&lt;p&gt;I should note that it&#39;s entirely possible to have works that are proprietary from a licensing point of view but completely open technically. I may be able to purchase a database that I can&#39;t redistribute but which is in a clean, consistent format I can write software to process. It has the disadvantage that I can&#39;t make corrections available to others or redistribute derivative works, but it&#39;s better than a closed-license work that&#39;s also closed with regard to facets discussed in this post.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">In my &lt;a href=&#34;/2015/11/10/why-i-use-cc-sa-licenses/&#34;&gt;previous post&lt;/a&gt;, I talked about the legal / licensing aspects of open linguistic data but there are technical aspects in order for linguistic data to be open too.</summary>
  </entry><entry>
    <title type="html">Why I Use CC-BY-SA Licenses</title>
    <link href="https://jktauber.com/2015/11/10/why-i-use-cc-sa-licenses/" rel="alternate" type="text/html" title="Why I Use CC-BY-SA Licenses"/>
    <published>2015-11-10</published>
    <updated>2015-11-10</updated>
    <id>https://jktauber.com/2015/11/10/why-i-use-cc-sa-licenses</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/10/why-i-use-cc-sa-licenses/">&lt;p&gt;I don&#39;t think I&#39;ve ever articulated why I favour a Creative Commons CC-BY-SA license on all my New Testament Greek data.&lt;/p&gt;
&lt;p&gt;I don&#39;t mean why do open scholarship in general, but why my specific choice of Attribution-ShareAlike?&lt;/p&gt;
&lt;p&gt;I avoid NoDerivs (&lt;strong&gt;ND&lt;/strong&gt;) because I &lt;em&gt;want&lt;/em&gt; people to build on my work, make corrections, add new analyses.&lt;/p&gt;
&lt;p&gt;I use ShareAlike (&lt;strong&gt;SA&lt;/strong&gt;), though, because I want to be able to incorporate corrections and new analyses back and want to avoid private forks of projects. Note that when it comes to software, I generally favour MIT/BSD-style licenses that aren&#39;t viral. But when it comes to data and analyses, I want the openness to be viral.&lt;/p&gt;
&lt;p&gt;Perhaps more controversially, I avoid NonCommercial (&lt;strong&gt;NC&lt;/strong&gt;). My reason is simple: I don&#39;t want someone who wants to use my work in a commercial package to have to waste time reinventing the wheel and redoing everything just so they can use it. Duplication of effort doesn&#39;t help anyone. Because of the ShareAlike, a commercial project can&#39;t make private forks. I don&#39;t care if someone is making money as long as improvements they make to my work are shared back.&lt;/p&gt;
&lt;p&gt;Creative Commons doesn&#39;t have a license that requires ShareAlike but not Attribution but, even if they did, I&#39;d use Attribution (&lt;strong&gt;BY&lt;/strong&gt;). Particularly in scholarship, I think it&#39;s important to give credit where credit is due. Plus having a chain of who did the work is useful for providing corrections upstream.&lt;/p&gt;
&lt;p&gt;My arguments for using ShareAlike and Attribution are why I don&#39;t like just putting things in the &#34;public domain&#34; / under a CC0 license. (Incidentally, I put &#34;public domain&#34; in quotes because it&#39;s an ill-defined concept, which is why the CC0 license was developed in the first place. Even if you&#39;re not persuaded by my arguments for BY-SA, at least use CC0 rather than saying &#34;public domain&#34;).&lt;/p&gt;
&lt;p&gt;Finally, I&#39;d be remiss if I didn&#39;t acknowledge the great work of the &lt;a href=&#34;http://creativecommons.org&#34;&gt;Creative Commons&lt;/a&gt; organization in making all this possible.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I don&#39;t think I&#39;ve ever articulated why I favour a Creative Commons CC-BY-SA license on all my New Testament Greek data.</summary>
  </entry><entry>
    <title type="html">Mean Log Frequency of Dependency Paths</title>
    <link href="https://jktauber.com/2015/11/09/mean-log-frequency-dependency-paths/" rel="alternate" type="text/html" title="Mean Log Frequency of Dependency Paths"/>
    <published>2015-11-09</published>
    <updated>2015-11-09</updated>
    <id>https://jktauber.com/2015/11/09/mean-log-frequency-dependency-paths</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/09/mean-log-frequency-dependency-paths/">&lt;p&gt;Adding another potential readability metric, let&#39;s look at the mean log frequency of dependency paths.&lt;/p&gt;
&lt;p&gt;So far we&#39;ve looked at the &lt;a href=&#34;/2015/10/27/mean-log-frequency-lexemes/&#34;&gt;mean log frequency of lexemes&lt;/a&gt;, the &lt;a href=&#34;/2015/11/04/mean-log-frequency-forms/&#34;&gt;mean log frequency of forms&lt;/a&gt;, and, after calculating &lt;a href=&#34;/2015/10/28/dependency-paths/&#34;&gt;dependency paths&lt;/a&gt; or &#34;swords&#34;, the &lt;a href=&#34;/2015/10/29/mean-dependency-depth/&#34;&gt;mean dependency depth&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;What we haven&#39;t looked at is the mean log frequency of those dependency paths—a rough proxy for a target having common (rather than merely shallow) syntactic structures.&lt;/p&gt;
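&lt;p&gt;One way such a score might be computed is sketched here. This is not the code behind the numbers in this post: it assumes the score is the mean of the negative log relative frequency (so lower means more common), it ignores whatever scaling the reported scores use, and it treats each dependency path as an opaque string. The name &lt;code&gt;mean_log_frequency&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
import math
from collections import Counter


def mean_log_frequency(target_items, corpus_items):
    """Mean of -log(relative corpus frequency) over the items in a target.

    Assumes every item in the target also occurs in the corpus.
    """
    counts = Counter(corpus_items)
    total = sum(counts.values())
    scores = [-math.log(counts[item] / total) for item in target_items]
    return sum(scores) / len(scores)


# Toy corpus: path "a" is common, path "b" is rare.
corpus = ["a", "a", "a", "b"]
print(mean_log_frequency(["a", "a"], corpus))  # target made of common paths
print(mean_log_frequency(["a", "b"], corpus))  # target including a rare path
```

&lt;p&gt;In this toy example the first target gets the lower score because it only contains the more frequent path.&lt;/p&gt;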
&lt;p&gt;By this measure, the top five (i.e. lowest scoring) books are:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;4832 1 Corinthians
4929 3 John
4935 1 John
4938 John
5027 James
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;and the top 10 chapters are:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;4183 1 Corinthians 13
4362 1 Corinthians 9
4386 1 Corinthians 14
4485 Romans 14
4486 John 16
4550 1 John 3
4558 2 Corinthians 11
4564 1 Corinthians 6
4566 1 Corinthians 7
4576 John 7
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It is interesting just how much 1 Corinthians features here. The book (and those chapters featured above) do poorly in terms of mean log frequency of lexemes.&lt;/p&gt;
&lt;p&gt;If 1 Corinthians is actually &lt;em&gt;syntactically&lt;/em&gt; easy to read, I wonder if that&#39;s an argument for having some readings which, because of vocab, need to be heavily footnoted with glosses but which are still worth reading early because of the syntax.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Adding another potential readability metric, let&#39;s look at the mean log frequency of dependency paths.</summary>
  </entry><entry>
    <title type="html">At the Half Way Point</title>
    <link href="https://jktauber.com/2015/11/08/half-way-point/" rel="alternate" type="text/html" title="At the Half Way Point"/>
    <published>2015-11-08</published>
    <updated>2015-11-08</updated>
    <id>https://jktauber.com/2015/11/08/half-way-point</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/08/half-way-point/">&lt;p&gt;Exactly two weeks ago I said I&#39;d be blogging every day until my talk at SBL. Well, that&#39;s two weeks away so I&#39;m at the half way point. I think the blogging has gone well.&lt;/p&gt;
&lt;p&gt;Many of the posts have been things I&#39;ve had drafts of for a while. Others have been ideas that haven&#39;t taken long to get down in a post. Attempting to blog every day means I haven&#39;t really worked on posts that represent multiple days, much less weeks or months, of work.&lt;/p&gt;
&lt;p&gt;In the next two weeks I do hope to talk about a few longer-running projects but, that said, I do enjoy getting down an idea or concept that&#39;s just a short post but which has been on my mind for years.&lt;/p&gt;
&lt;p&gt;Thanks to the people who have so far engaged with my posts via email and elsewhere. My interactions with you are a huge motivation for me doing this.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Exactly two weeks ago I said I&#39;d be blogging every day until my talk at SBL. Well, that&#39;s two weeks away so I&#39;m at the half way point. I think the blogging has gone well.</summary>
  </entry><entry>
    <title type="html">Generating Readers</title>
    <link href="https://jktauber.com/2015/11/07/generating-readers/" rel="alternate" type="text/html" title="Generating Readers"/>
    <published>2015-11-07</published>
    <updated>2015-11-07</updated>
    <id>https://jktauber.com/2015/11/07/generating-readers</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/07/generating-readers/">&lt;p&gt;Back in April 2014, Brian Renshaw posted a &lt;a href=&#34;http://www.brianrenshaw.com/blog/2014/4/18/a-good-friday-greek-reader-john-18-19&#34;&gt;Good Friday Greek Reader&lt;/a&gt;. It was presumably manually produced but I knew such things could be generated automatically and so went about building a system to do so.&lt;/p&gt;
&lt;p&gt;You can see a sample PDF at &lt;a href=&#34;https://github.com/jtauber/greek-reader/blob/master/example/reader.pdf&#34;&gt;https://github.com/jtauber/greek-reader/blob/master/example/reader.pdf&lt;/a&gt; which roughly looks like what Brian produced.&lt;/p&gt;
&lt;p&gt;From a code point of view, it&#39;s a fairly simple Python 3 script that generates LaTeX that is then typeset using XeTeX. There is also an experimental backend using SILE. The code is open source under an MIT license and is available at &lt;a href=&#34;https://github.com/jtauber/greek-reader&#34;&gt;https://github.com/jtauber/greek-reader&lt;/a&gt;. It assumes you&#39;re comfortable with those tools and editing text files to tweak things, but my hope is eventually a website could be built around this.&lt;/p&gt;
&lt;p&gt;To produce a reader like this, whether manually or automatically, you need:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;a text&lt;/li&gt;
&lt;li&gt;lemmatization&lt;/li&gt;
&lt;li&gt;frequency counts&lt;/li&gt;
&lt;li&gt;glosses&lt;/li&gt;
&lt;li&gt;full citation forms / headwords (e.g. λαμπάς, άδος, ἡ) for nominals&lt;/li&gt;
&lt;li&gt;parsing (e.g. AAI 3S) for verbs&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;MorphGNT gave me 1, 2, 3 and 6. 4 came from Dodson (although you can override both globally and per verse) and 5 came from Danker&#39;s Concise Lexicon.&lt;/p&gt;
&lt;p&gt;What&#39;s nice about doing this programmatically, besides the fact that you can make corrections upstream and have them applied to all the generated readers, is that you can &lt;strong&gt;make this adaptive&lt;/strong&gt;. In the example, I chose which words to annotate based on frequency but it could just as easily be based on other criteria such as what a particular student has learnt up to this point or what has been covered in a particular textbook up to this point.&lt;/p&gt;
&lt;p&gt;One major feature I want to add, though, is richer annotation both morphologically AND syntactically so it becomes possible to generate something more akin to Zerwick and Grosvenor&#39;s &lt;em&gt;A Grammatical Analysis of the Greek New Testament&lt;/em&gt;.&lt;/p&gt;
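&lt;p&gt;As a rough sketch of that adaptive idea (not the actual greek-reader code), choosing which words to annotate can be reduced to a lookup against whatever the student already knows. The function name, the shape of the word dictionaries and the &lt;code&gt;known_lemmas&lt;/code&gt; set are all hypothetical, and the glosses are only illustrative:&lt;/p&gt;

```python
def words_to_annotate(words, known_lemmas):
    """Return (text, lemma, gloss) for words the student has not yet learnt.

    `words` is a list of dicts with (assumed) keys "text", "lemma"
    and "gloss"; `known_lemmas` is a set of lemmas already covered.
    Each unknown lemma is annotated only on its first occurrence.
    """
    seen = set(known_lemmas)
    notes = []
    for word in words:
        if word["lemma"] not in seen:
            notes.append((word["text"], word["lemma"], word["gloss"]))
            seen.add(word["lemma"])
    return notes


# Illustrative data only: a few words from John 18:3.
verse = [
    {"text": "μετὰ", "lemma": "μετά", "gloss": "with"},
    {"text": "φανῶν", "lemma": "φανός", "gloss": "lantern"},
    {"text": "καὶ", "lemma": "καί", "gloss": "and"},
    {"text": "λαμπάδων", "lemma": "λαμπάς", "gloss": "torch"},
]
known = {"μετά", "καί"}

for text, lemma, gloss in words_to_annotate(verse, known):
    print(text, lemma, gloss)
```

&lt;p&gt;Swapping the &lt;code&gt;known_lemmas&lt;/code&gt; set for the vocabulary of a given textbook chapter, or for the lemmata above some frequency cut-off, would give the other selection criteria mentioned above.&lt;/p&gt;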
&lt;p&gt;One major motivation for my continuing work on a &lt;em&gt;Morphological Lexicon&lt;/em&gt; is being able to provide more focused, helpful annotations for readers indicating not just a lemma but a principal part or some additional information that helps the student understand the form.&lt;/p&gt;
&lt;p&gt;For the syntax, I&#39;d like to eventually develop a catalog of constructions so, much like forms are only annotated if they are less frequent (or otherwise unknown to the student), particular syntactic constructions in a text can be called out based on similar criteria. Some of this is possible with existing syntactic analyses; the trick is knowing which annotations to include and which are already obvious. (I have some ideas for how to crowdsource difficult constructions, but more on that later).&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;greek-reader&lt;/strong&gt; project is a great example of a pretty simple tool that can do a lot because it builds on rich data. As we get better and better data, we can build better and better tools.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Back in April 2014, Brian Renshaw posted a &lt;a href=&#34;http://www.brianrenshaw.com/blog/2014/4/18/a-good-friday-greek-reader-john-18-19&#34;&gt;Good Friday Greek Reader&lt;/a&gt;. It was presumably manually produced but I knew such things could be generated automatically and so went about building a system to do so.</summary>
  </entry><entry>
    <title type="html">Inline Annotation of Sandhi</title>
    <link href="https://jktauber.com/2015/11/06/inline-annotation-sandhi/" rel="alternate" type="text/html" title="Inline Annotation of Sandhi"/>
    <published>2015-11-06</published>
    <updated>2015-11-06</updated>
    <id>https://jktauber.com/2015/11/06/inline-annotation-sandhi</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/06/inline-annotation-sandhi/">&lt;p&gt;In many Greek morphology projects, I&#39;ve wanted a way of conveying the surface form of an inflected word while also conveying the underlying components prior to the application of the sandhi rule. A couple of years ago, I came up with a simple representation for inline annotation.&lt;/p&gt;
&lt;p&gt;Say you want to convey the fact that φιλοῦμεν comes from φιλε + ομεν by application of the rule that ε + ο → ου. In the representation I&#39;ve been using you&#39;d write &lt;code&gt;φιλ|ε&amp;gt;ου&amp;lt;ο|μεν&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;This enables you to see the stem and affix easily but also the result of sandhi.&lt;/p&gt;
&lt;p&gt;So what &lt;code&gt;A|B&amp;gt;C&amp;lt;D|E&lt;/code&gt; means is there is a sandhi rule that B + D → C and that rule has been applied in AB + DE to form ACE.&lt;/p&gt;
&lt;p&gt;Using Stump&#39;s terminology introduced in a &lt;a href=&#34;/2015/11/03/distinguishers-morphology/&#34;&gt;previous post&lt;/a&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A / φιλ is the &lt;strong&gt;theme&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;CE / ουμεν is the &lt;strong&gt;distinguisher&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;AB / φιλε is the &lt;strong&gt;stem&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;DE / ομεν is the &lt;strong&gt;affix&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It also means that you can search for &lt;code&gt;|B&amp;gt;C&amp;lt;D|&lt;/code&gt; to find where that particular sandhi rule has been applied.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">In many Greek morphology projects, I&#39;ve wanted a way of conveying the surface form of an inflected word while also conveying the underlying components prior to the application of the sandhi rule. A couple of years ago, I came up with a simple representation for inline annotation.</summary>
  </entry><entry>
    <title type="html">Morphological Parts of Speech in Greek</title>
    <link href="https://jktauber.com/2015/11/05/morphological-parts-speech-greek/" rel="alternate" type="text/html" title="Morphological Parts of Speech in Greek"/>
    <published>2015-11-05</published>
    <updated>2015-11-05</updated>
    <id>https://jktauber.com/2015/11/05/morphological-parts-speech-greek</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/05/morphological-parts-speech-greek/">&lt;p&gt;The parts of speech in a particular language can be drawn up on the basis of syntactic properties, morphological properties, and/or (perhaps most problematically) semantic properties.&lt;/p&gt;
&lt;p&gt;What if we just want to classify lexemes in the MorphGNT based on what morphosyntactic and morphosemantic features they have?&lt;/p&gt;
&lt;p&gt;Minimally, we might get something like this:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;case&lt;/th&gt;
&lt;th&gt;person&lt;/th&gt;
&lt;th&gt;aspect&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;&lt;em&gt;conjunctions, adverbs, interjections, prepositions, particles, indeclinable nouns and adjectives&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;+&lt;/td&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;&lt;em&gt;nouns, adjectives, pronouns, articles&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;+&lt;/td&gt;
&lt;td&gt;&lt;em&gt;infinitives&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;+&lt;/td&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;+&lt;/td&gt;
&lt;td&gt;&lt;em&gt;participles&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;+&lt;/td&gt;
&lt;td&gt;+&lt;/td&gt;
&lt;td&gt;&lt;em&gt;finite verbs&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
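&lt;p&gt;Read as data, the table is just a lookup keyed on which of the three features a form is marked for. Here is a minimal sketch of that reading; the class labels are shorthand for the rows above and the function name is hypothetical:&lt;/p&gt;

```python
# Rows of the table above, keyed by (case, person, aspect).
MORPHOLOGICAL_CLASSES = {
    (False, False, False): "indeclinable",
    (True, False, False): "nominal",
    (False, False, True): "infinitive",
    (True, False, True): "participle",
    (False, True, True): "finite verb",
}


def classify(features):
    """Classify a form by the set of features it is marked for.

    `features` is any collection of feature names, e.g.
    {"case", "number", "gender"}; only case, person and aspect
    are consulted when picking the class.
    """
    key = ("case" in features, "person" in features, "aspect" in features)
    return MORPHOLOGICAL_CLASSES.get(key, "unclassified")


print(classify({"case", "number", "gender"}))                     # nominal
print(classify({"aspect", "voice"}))                              # infinitive
print(classify({"case", "number", "gender", "aspect", "voice"}))  # participle
print(classify({"person", "number", "aspect", "voice", "mood"}))  # finite verb
```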
&lt;p&gt;We could consider voice, but it co-occurs with aspect, so its value is predictable.&lt;/p&gt;
&lt;p&gt;Mood only appears in finite verbs, which means it&#39;s also predictable (arguably, co-occurrent with person but see below).&lt;/p&gt;
&lt;p&gt;Number is predictable as it co-occurs with case or person.&lt;/p&gt;
&lt;p&gt;As things stand above, gender is also predictable (it co-occurs with case).&lt;/p&gt;
&lt;p&gt;However, let&#39;s consider the distinction between the 1st/2nd person pronouns on the one hand and the proforms on the other.&lt;/p&gt;
&lt;p&gt;(There are strong arguments beyond just morphology for distinguishing the (1st/2nd person) personal pronouns and proforms. See Bhat&#39;s book &lt;em&gt;Pronouns&lt;/em&gt; for cross-linguistic arguments for the distinction.)&lt;/p&gt;
&lt;p&gt;The 1st/2nd person pronouns, unlike the proforms, don&#39;t inflect for gender. So let&#39;s add gender to the mix:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;case&lt;/th&gt;
&lt;th&gt;person&lt;/th&gt;
&lt;th&gt;gender&lt;/th&gt;
&lt;th&gt;aspect&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;+&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;+&lt;/td&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;+&lt;/td&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;+&lt;/td&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;+&lt;/td&gt;
&lt;td&gt;+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;+&lt;/td&gt;
&lt;td&gt;&amp;minus;&lt;/td&gt;
&lt;td&gt;+&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The &lt;big&gt;?&lt;/big&gt; under person for the personal pronouns is because they don&#39;t really &lt;em&gt;inflect&lt;/em&gt; for person. Person is lexical in the personal pronouns.&lt;/p&gt;
&lt;p&gt;Interestingly, though, if we &lt;em&gt;do&lt;/em&gt; give it a &lt;big&gt;+&lt;/big&gt; then we don&#39;t need gender to distinguish the category.&lt;/p&gt;
&lt;p&gt;You may wonder about &lt;em&gt;degree&lt;/em&gt;. I&#39;m currently of the inclination that degree is better modeled derivationally rather than inflectionally, although that&#39;s worthy of a separate post.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">The parts of speech in a particular language can be drawn up on the basis of syntactic properties, morphological properties, and/or (perhaps most problematically) semantic properties.</summary>
  </entry><entry>
    <title type="html">Mean Log Frequency of Forms</title>
    <link href="https://jktauber.com/2015/11/04/mean-log-frequency-forms/" rel="alternate" type="text/html" title="Mean Log Frequency of Forms"/>
    <published>2015-11-04</published>
    <updated>2015-11-04</updated>
    <id>https://jktauber.com/2015/11/04/mean-log-frequency-forms</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/04/mean-log-frequency-forms/">&lt;p&gt;In &lt;a href=&#34;/2015/10/27/mean-log-frequency-lexemes/&#34;&gt;a previous post&lt;/a&gt;, we looked at which chapters had the highest mean log frequency of lexemes. The code provided there was applicable to other items, though, so let&#39;s now take a look at mean log frequency of &lt;strong&gt;forms&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The code change is a simple change to one line.&lt;/p&gt;
&lt;p&gt;The top 10 are:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;6277 2304 449
6373 2305 429
6500 2302 585
6558 0403 657
6562 2303 467
6596 1001 401
6600 0408 905
6617 2301 207
6640 0702 287
6646 2720 406
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In other words:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;1 John 4 (also 1st for lexemes)&lt;/li&gt;
&lt;li&gt;1 John 5 (also 2nd for lexemes)&lt;/li&gt;
&lt;li&gt;1 John 2 (8th for lexemes)&lt;/li&gt;
&lt;li&gt;John 3 (9th for lexemes)&lt;/li&gt;
&lt;li&gt;1 John 3 (7th for lexemes)&lt;/li&gt;
&lt;li&gt;Ephesians 1 (11th for lexemes)&lt;/li&gt;
&lt;li&gt;John 8 (6th for lexemes)&lt;/li&gt;
&lt;li&gt;1 John 1 (4th for lexemes)&lt;/li&gt;
&lt;li&gt;1 Corinthians 2 (32nd for lexemes)&lt;/li&gt;
&lt;li&gt;Revelation 20 (14th for lexemes)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Generally form frequency will track pretty closely with lexeme frequency because a form being common makes the lexeme common. This makes 1 Corinthians 2 interesting.&lt;/p&gt;
&lt;p&gt;Frequent words and forms obviously don&#39;t necessarily mean shallow syntax, though. 1 John 4, 5 and 2 are respectively the 36th, 67th and 38th by mean dependency depth. There are no chapters that are in the top ten of both mean log form frequency AND mean dependency depth.&lt;/p&gt;
&lt;p&gt;So we now have mean log frequencies for lexemes and forms as well as mean dependency depth. In future posts, I&#39;ll add parse codes and the actual dependency path to the mix and then we can look at combining all five metrics. I&#39;ll also look at paragraphs rather than chapters as targets.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">In &lt;a href=&#34;/2015/10/27/mean-log-frequency-lexemes/&#34;&gt;a previous post&lt;/a&gt;, we looked at which chapters had the highest mean log frequency of lexemes. The code provided there was applicable to other items, though, so let&#39;s now take a look at mean log frequency of &lt;strong&gt;forms&lt;/strong&gt;.</summary>
  </entry><entry>
    <title type="html">Distinguishers in Morphology</title>
    <link href="https://jktauber.com/2015/11/03/distinguishers-morphology/" rel="alternate" type="text/html" title="Distinguishers in Morphology"/>
    <published>2015-11-03</published>
    <updated>2015-11-03</updated>
    <id>https://jktauber.com/2015/11/03/distinguishers-morphology</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/03/distinguishers-morphology/">&lt;p&gt;A few years ago, I was introduced by Greg Stump to the notion of &lt;strong&gt;distinguishers&lt;/strong&gt; in morphological description. The analysis of inflected forms in terms of theme + distinguisher is a very helpful concept and one that I make extensive use of in my ongoing work on New Testament Greek morphology.&lt;/p&gt;
&lt;p&gt;Take a word like φιλοῦμεν. The underlying stem is φιλε and the suffix is ομεν. The sandhi rule ε + ο → ου has been applied.&lt;/p&gt;
&lt;p&gt;So in the surface form of the word, the φιλ is &lt;em&gt;part&lt;/em&gt; but not &lt;em&gt;all&lt;/em&gt; of the stem. It&#39;s the part that will likely (unless there is suppletion) be common with other cells in the paradigm. Similarly οῦμεν is not the suffix but it is the part that is indicating &#34;first person plural&#34; (as well as indicating that the stem likely ends in ε or ο).&lt;/p&gt;
&lt;p&gt;Stump calls φιλ the &lt;strong&gt;theme&lt;/strong&gt; and οῦμεν the &lt;strong&gt;distinguisher&lt;/strong&gt;. The &lt;strong&gt;theme&lt;/strong&gt; is what the cells in a paradigm have in common, the &lt;strong&gt;distinguisher&lt;/strong&gt; is what distinguishes them from one another.&lt;/p&gt;
&lt;p&gt;SPOILER ALERT: I&#39;m working on a full theme/distinguisher and stem/suffix analysis of every inflected form in the Greek New Testament as part of my &lt;em&gt;Morphological Lexicon of New Testament Greek&lt;/em&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">A few years ago, I was introduced by Greg Stump to the notion of &lt;strong&gt;distinguishers&lt;/strong&gt; in morphological description. The analysis of inflected forms in terms of theme + distinguisher is a very helpful concept and one that I make extensive use of in my ongoing work on New Testament Greek morphology.</summary>
  </entry><entry>
    <title type="html">Atom Editor 1.1 Fixes Polytonic Greek Bug</title>
    <link href="https://jktauber.com/2015/11/02/atom-editor-11-fixes-polytonic-greek-bug/" rel="alternate" type="text/html" title="Atom Editor 1.1 Fixes Polytonic Greek Bug"/>
    <published>2015-11-02</published>
    <updated>2015-11-02</updated>
    <id>https://jktauber.com/2015/11/02/atom-editor-11-fixes-polytonic-greek-bug</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/02/atom-editor-11-fixes-polytonic-greek-bug/">&lt;p&gt;Release 1.1 of GitHub&#39;s Atom Editor fixes a problem I had with using it for polytonic Greek.&lt;/p&gt;
&lt;p&gt;I was an early adopter of &lt;a href=&#34;https://atom.io&#34;&gt;Atom Editor&lt;/a&gt; despite some initial rough edges. I now use it for all my development, including Greek-related stuff talked about on this blog—not just code but data files as well.&lt;/p&gt;
&lt;p&gt;Most of the rough edges got sorted out early on and certainly before the 1.0 release but there was one problem, highly relevant to this blog, that persisted.&lt;/p&gt;
&lt;p&gt;Basically, Atom was miscalculating the width of characters formed from Unicode combining characters which made it quite difficult to work with text files containing polytonic Greek.&lt;/p&gt;
&lt;p&gt;You can see the problem in this screenshot:&lt;/p&gt;
&lt;p&gt;&lt;img width=&#34;100%&#34; src=&#34;/images/atom_before.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;Notice that the existence of diacritics on the alpha at the end of some of the lines actually changes the width of preceding characters, even though a fixed-width font is being used. As well as just looking weird, it made files difficult to use as the cursor position didn&#39;t correspond visually to where typing would occur.&lt;/p&gt;
&lt;p&gt;I filed a &lt;a href=&#34;https://github.com/atom/atom/issues/5975&#34;&gt;bug report&lt;/a&gt; back in March and was disappointed a fix didn&#39;t make the Atom 1.0 release. But once I found out what was involved in fixing it (it didn&#39;t just affect polytonic Greek but a lot of non-ASCII use cases) I was impressed. If you want the raw details, see &lt;a href=&#34;https://github.com/atom/atom/pull/6083&#34;&gt;here&lt;/a&gt; and &lt;a href=&#34;https://github.com/atom/atom/pull/8811&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;A couple of weeks ago Atom 1.1 came out and it includes all that work that (amongst other things) fixes the bug I filed.&lt;/p&gt;
&lt;p&gt;Now it works perfectly:&lt;/p&gt;
&lt;p&gt;&lt;img width=&#34;100%&#34; src=&#34;/images/atom_after.png&#34;&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Release 1.1 of GitHub&#39;s Atom Editor fixes a problem I had with using it for polytonic Greek.</summary>
  </entry><entry>
    <title type="html">Renaming Non-Indicative Tense-Forms</title>
    <link href="https://jktauber.com/2015/11/01/renaming-non-indicative-tense-forms/" rel="alternate" type="text/html" title="Renaming Non-Indicative Tense-Forms"/>
    <published>2015-11-01</published>
    <updated>2015-11-01</updated>
    <id>https://jktauber.com/2015/11/01/renaming-non-indicative-tense-forms</id>
    <content type="html" xml:base="https://jktauber.com/2015/11/01/renaming-non-indicative-tense-forms/">&lt;p&gt;I think it&#39;s confusing that we name the non-indicative tense-forms with the same terms as indicative tense-forms. For example “present indicative” and “present infinitive”. The word “present” doesn&#39;t mean the same thing in both cases.&lt;/p&gt;
&lt;p&gt;When there is a past/non-past alternation in Greek (e.g. imperfect/present or pluperfect/perfect), only one of the pair is possible in non-indicatives.&lt;/p&gt;
&lt;p&gt;The reason for this is simple: only the indicative mood makes a past/non-past distinction. In other cases, only aspect is conveyed.&lt;/p&gt;
&lt;p&gt;But this is undermined when we then go and choose for the non-indicative, &#34;aspect only&#34; forms the same terms that, in the indicative mood, are specifically conveying a non-past tense.&lt;/p&gt;
&lt;p&gt;It would be far better to use a term with the non-indicatives that conveys &lt;em&gt;only&lt;/em&gt; the aspect.&lt;/p&gt;
&lt;p&gt;&#34;Imperfective&#34; and &#34;perfective&#34; are obvious choices instead of &#34;present&#34; and &#34;aorist&#34; respectively (although it&#39;s not clear what we&#39;d use for the perfect or future non-indicatives).&lt;/p&gt;
&lt;p&gt;The same issue arises in discussion of &#34;systems&#34; and &#34;stems&#34;. Rather than the &#34;present system&#34; or the &#34;present stem&#34; should we instead talk about the &#34;imperfective system&#34; and &#34;imperfective stem&#34; in Greek?&lt;/p&gt;
&lt;p&gt;If we use &#34;perfective stem&#34; rather than &#34;aorist stem&#34; we avoid the asymmetry of talking about an augmented/un-augmented aorist stem but not (or at least not without some awkwardness) an augmented/un-augmented present stem. (One might be forgiven for thinking Greek involves a morphological process of &lt;em&gt;removing&lt;/em&gt; an augment if some descriptions of the aorist/perfective system are to be believed.)&lt;/p&gt;
&lt;p&gt;Of course even in the above, there is the confusing use of terminology for what to call the bundle of aspect and tense.&lt;/p&gt;
&lt;p&gt;Sometimes the bundles themselves are called &#34;tenses&#34; and the tense axis (as opposed to aspect) is referred to as &#34;time&#34;.&lt;/p&gt;
&lt;p&gt;Sometimes the bundles are called &#34;tense-forms&#34;, which I think is better but still slightly confusing as that should really be &#34;tense-aspect-forms&#34; or, perhaps, &#34;aspect-tense-forms&#34;.&lt;/p&gt;
&lt;p&gt;As an aside: the use of &#34;form&#34; is interesting as it places the bundling squarely in the realm of form, not meaning. In other words, even though the realization involves cumulative exponence (to adopt the terminology of Matthews), the meaning is just the union of the tense and aspect.&lt;/p&gt;
&lt;p&gt;All of this plays into morphological tagging as well. I&#39;ve suggested for the &lt;a href=&#34;https://github.com/morphgnt/sblgnt/wiki/Proposal-for-a-New-Tagging-Scheme&#34;&gt;rethink of the parse codes in MorphGNT 7&lt;/a&gt; that tense and aspect be split into two features.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I think it&#39;s confusing that we name the non-indicative tense-forms with the same terms as indicative tense-forms. For example “present indicative” and “present infinitive”. The word “present” doesn&#39;t mean the same thing in both cases.</summary>
  </entry><entry>
    <title type="html">An Experimental REST API to MorphGNT</title>
    <link href="https://jktauber.com/2015/10/31/experimental-rest-api-morphgnt/" rel="alternate" type="text/html" title="An Experimental REST API to MorphGNT"/>
    <published>2015-10-31</published>
    <updated>2015-10-31</updated>
    <id>https://jktauber.com/2015/10/31/experimental-rest-api-morphgnt</id>
    <content type="html" xml:base="https://jktauber.com/2015/10/31/experimental-rest-api-morphgnt/">&lt;p&gt;Back in July, I thought I&#39;d prototype a REST API for MorphGNT with resources for books, paragraphs, sentences, verses and words.&lt;/p&gt;
&lt;p&gt;The prototype is available on &lt;a href=&#34;http://api.morphgnt.org/&#34;&gt;http://api.morphgnt.org/&lt;/a&gt; and the underlying code &lt;a href=&#34;https://github.com/morphgnt/morphgnt-api&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The API exposes in JSON not only the normal MorphGNT data but also the paragraphs from the SBLGNT proper, the sentence divisions from the GBI syntax analysis AND the dependency relationships discussed in &lt;a href=&#34;/2015/07/02/converting-gbi-syntax-trees-dependency-analysis/&#34;&gt;Converting the GBI Syntax Trees to a Dependency Analysis&lt;/a&gt;. So for now, at least, it&#39;s the only place you can get all that info.&lt;/p&gt;
&lt;p&gt;The prototype is currently served up using Django hitting a PostgreSQL database but it would be possible to just generate the roughly 150,000 JSON files once and serve them up from a CDN.&lt;/p&gt;
&lt;p&gt;There&#39;s only one thing using the API that I know of at the moment and that&#39;s the &lt;a href=&#34;/labs/morphgnt-api-reader.html&#34;&gt;lab on this site&lt;/a&gt;. It doesn&#39;t make use of a lot of the rich word-level information but it does demo how you can navigate through paragraphs of the GNT purely using the links in a book&#39;s &lt;code&gt;first_paragraph&lt;/code&gt; or a paragraph&#39;s &lt;code&gt;prev&lt;/code&gt; and &lt;code&gt;next&lt;/code&gt; fields.&lt;/p&gt;
&lt;p&gt;Note that the &lt;code&gt;/v0/&lt;/code&gt; prefix is used in URLs because there is no commitment to keep this API. It is subject to rapid change at the moment.&lt;/p&gt;
&lt;p&gt;The URI patterns are:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;/v0/root.json
/v0/book/{osis_id}.json
/v0/paragraph/{paragraph_id}.json
/v0/sentence/{sentence_id}.json
/v0/verse/{verse_id}.json
/v0/word/{word_id}.json
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A word (currently) looks something like this:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;{
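    &amp;quot;@id&amp;quot;: &amp;quot;/v0/word/64001001005.json&amp;quot;,
    &amp;quot;@type&amp;quot;: &amp;quot;word&amp;quot;,
    &amp;quot;verse_id&amp;quot;: &amp;quot;/v0/verse/640101.json&amp;quot;,
    &amp;quot;sentence_id&amp;quot;: &amp;quot;/v0/sentence/640001.json&amp;quot;,
    &amp;quot;paragraph_id&amp;quot;: &amp;quot;/v0/paragraph/64001.json&amp;quot;,
    &amp;quot;crit_text&amp;quot;: &amp;quot;λόγος,&amp;quot;,
    &amp;quot;text&amp;quot;: &amp;quot;λόγος,&amp;quot;,
    &amp;quot;word&amp;quot;: &amp;quot;λόγος&amp;quot;,
    &amp;quot;norm&amp;quot;: &amp;quot;λόγος&amp;quot;,
    &amp;quot;lemma&amp;quot;: &amp;quot;λόγος&amp;quot;,
    &amp;quot;pos&amp;quot;: &amp;quot;N&amp;quot;,
    &amp;quot;case&amp;quot;: &amp;quot;N&amp;quot;,
    &amp;quot;number&amp;quot;: &amp;quot;S&amp;quot;,
    &amp;quot;gender&amp;quot;: &amp;quot;M&amp;quot;,
    &amp;quot;dep_type&amp;quot;: &amp;quot;S&amp;quot;,
    &amp;quot;head&amp;quot;: &amp;quot;/v0/word/64001001002.json&amp;quot;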
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A verse (currently) looks something like this:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;{
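    &amp;quot;@id&amp;quot;: &amp;quot;/v0/verse/640101.json&amp;quot;,
    &amp;quot;@type&amp;quot;: &amp;quot;verse&amp;quot;,
    &amp;quot;prev&amp;quot;: null,
    &amp;quot;next&amp;quot;: &amp;quot;/v0/verse/640102.json&amp;quot;,
    &amp;quot;book&amp;quot;: &amp;quot;/v0/book/John.json&amp;quot;,
    &amp;quot;words&amp;quot;: [...]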
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;where &lt;code&gt;words&lt;/code&gt; is a list of objects like the word above.&lt;/p&gt;
&lt;p&gt;A paragraph and sentence are very similar to a verse (with an &lt;code&gt;@id&lt;/code&gt;, &lt;code&gt;@type&lt;/code&gt;,
&lt;code&gt;prev&lt;/code&gt;, &lt;code&gt;next&lt;/code&gt;, &lt;code&gt;book&lt;/code&gt; and &lt;code&gt;words&lt;/code&gt; list).&lt;/p&gt;
&lt;p&gt;A book (currently) looks something like this:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;{
    &amp;quot;@id&amp;quot;: &amp;quot;/v0/book/1Cor.json&amp;quot;,
    &amp;quot;@type&amp;quot;: &amp;quot;book&amp;quot;,
    &amp;quot;name&amp;quot;: &amp;quot;1 Corinthians&amp;quot;,
    &amp;quot;root&amp;quot;: &amp;quot;/v0/root.json&amp;quot;,
    &amp;quot;first_paragraph&amp;quot;: &amp;quot;/v0/paragraph/67001.json&amp;quot;,
    &amp;quot;first_verse&amp;quot;: &amp;quot;/v0/verse/670101.json&amp;quot;,
    &amp;quot;first_sentence&amp;quot;: &amp;quot;/v0/sentence/670001.json&amp;quot;
}
&lt;/code&gt;&lt;/pre&gt;
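
&lt;p&gt;The paragraph-by-paragraph navigation described above can be sketched roughly as follows. This is only a sketch under assumptions: that &lt;code&gt;first_paragraph&lt;/code&gt; and &lt;code&gt;next&lt;/code&gt; hold paths relative to the API host, that the last paragraph has &lt;code&gt;next&lt;/code&gt; set to null, and that each word carries a &lt;code&gt;text&lt;/code&gt; field as in the word example. The helper names are hypothetical:&lt;/p&gt;

```python
import json
from urllib.request import urlopen

BASE_URL = "http://api.morphgnt.org"  # the prototype host mentioned above


def fetch_json(path):
    """Fetch one resource by its path, e.g. "/v0/book/John.json"."""
    with urlopen(BASE_URL + path) as response:
        return json.load(response)


def iter_paragraphs(book_path, fetch=fetch_json):
    """Yield each paragraph reachable from a book by following `next`."""
    path = fetch(book_path)["first_paragraph"]
    while path is not None:
        paragraph = fetch(path)
        yield paragraph
        path = paragraph["next"]


def paragraph_text(paragraph):
    """Join the `text` field of each word in a paragraph."""
    return " ".join(word["text"] for word in paragraph["words"])
```

&lt;p&gt;Calling &lt;code&gt;iter_paragraphs(&#34;/v0/book/John.json&#34;)&lt;/code&gt; would then walk John one paragraph at a time; passing a different &lt;code&gt;fetch&lt;/code&gt; function makes it possible to read pre-generated JSON files instead of hitting the server.&lt;/p&gt;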

&lt;p&gt;Feedback is greatly appreciated to make this more useful. I&#39;d particularly like to work with some front-end developers to do some more complex demos built on the API.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Back in July, I thought I&#39;d prototype a REST API for MorphGNT with resources for books, paragraphs, sentences, verses and words.</summary>
  </entry><entry>
    <title type="html">The Core Vocabulary of New Testament Greek</title>
    <link href="https://jktauber.com/2015/10/30/core-vocabulary-new-testament-greek/" rel="alternate" type="text/html" title="The Core Vocabulary of New Testament Greek"/>
    <published>2015-10-30</published>
    <updated>2015-10-30</updated>
    <id>https://jktauber.com/2015/10/30/core-vocabulary-new-testament-greek</id>
    <content type="html" xml:base="https://jktauber.com/2015/10/30/core-vocabulary-new-testament-greek/">&lt;p&gt;In a 2008 paper, Wilfred Major constructs what he calls the 50% and 80% vocab lists for Classical Greek. That is, the lemmata that account for 50% and 80% respectively of tokens in the Classical Greek corpus. In this post I provide the code for the equivalent for the Greek New Testament and talk about some of the results.&lt;/p&gt;
&lt;p&gt;Major&#39;s paper is &lt;a href=&#34;https://camws.org/cpl/cplonline/files/Majorcplonline.pdf&#34;&gt;It’s Not the Size, It’s the Frequency: The Value of Using a Core Vocabulary in Beginning and Intermediate Greek&lt;/a&gt; and as well as listing the 65 words in the &#34;50% List&#34; he lists the roughly 1,100 words in the &#34;80% List&#34; complete with glosses in both cases.&lt;/p&gt;
&lt;p&gt;Major also discusses other issues near and dear to this blog such as the relevance of form frequency as well as lemma frequency. I&#39;ll respond to him on some of these topics in later blog posts.&lt;/p&gt;
&lt;p&gt;Now, for many years I&#39;ve talked about the limitations of a purely frequency-based approach to vocab ordering but that doesn&#39;t mean producing such lists is useless, just that there are things we can do to improve on that approach. But I still thought it would be interesting to produce GNT 50% and 80% lists.&lt;/p&gt;
&lt;p&gt;The code is available &lt;a href=&#34;https://gist.github.com/jtauber/d05bbe3ee9536bf59147&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The 50% list consists of just 27 lemmata. The only verbs are γίνομαι, εἰμί, ἔχω, and λέγω. The only nouns are θεός, κύριος, and Ἰησοῦς.&lt;/p&gt;
&lt;p&gt;The 80% list consists of 317 lemmata.&lt;/p&gt;
&lt;p&gt;As expected, this is considerably smaller than Major&#39;s Classical Greek lists which are based on a considerably larger corpus.&lt;/p&gt;
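&lt;p&gt;The linked gist is the code actually used for these numbers. Independently of it, the idea behind an N% list can be sketched in a few lines: rank items by frequency, keep a running total of tokens covered, and cut off at the first item that reaches the target. The function name and the toy data below are assumptions for illustration only:&lt;/p&gt;

```python
from bisect import bisect_left
from collections import Counter
from itertools import accumulate


def coverage_list(tokens, coverage):
    """Return the most frequent items that together account for at
    least `coverage` (a fraction such as 0.5) of all `tokens`.

    `tokens` is an iterable with one lemma (or form) per token.
    """
    ranked = Counter(tokens).most_common()
    running = list(accumulate(count for _, count in ranked))
    target = coverage * running[-1]
    # Index of the first running total that reaches the target.
    cutoff = bisect_left(running, target)
    return [item for item, _ in ranked[:cutoff + 1]]


# Toy data, not real GNT counts.
lemmas = ["ὁ"] * 5 + ["καί"] * 3 + ["λέγω", "θεός"]
print(coverage_list(lemmas, 0.5))  # ['ὁ']
print(coverage_list(lemmas, 0.8))  # ['ὁ', 'καί']
```

&lt;p&gt;Because the function only sees a sequence of strings, feeding it one form per token instead of one lemma per token gives the corresponding forms list.&lt;/p&gt;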
&lt;p&gt;It&#39;s easy to tweak the code to look at forms rather than lemmata. The 50% &lt;em&gt;forms&lt;/em&gt; list for the GNT consists of 97 forms from 52 lemmata.&lt;/p&gt;
&lt;p&gt;Interestingly, those 97 forms consist of 16 forms of the article, 15 forms of the (1st/2nd person) personal pronouns, and 6 forms of αὐτός. This suggests that even without arguments on morphological grounds, it&#39;s worth learning the full paradigms for the article, the personal pronouns and αὐτός really early on.&lt;/p&gt;
&lt;p&gt;Unsurprisingly, λέγω gets a decent showing with 4 forms: εἶπεν, λέγει, λέγω and λέγων. I&#39;ve long thought it&#39;s worth learning those right away without needing to introduce full paradigms.&lt;/p&gt;
&lt;p&gt;There&#39;s a lot more that could be explored even with this frequency-based approach. And lots more to say based on the other things Major talks about in his paper.&lt;/p&gt;
&lt;p&gt;Finally, it should be stressed that very few full verses of the GNT would be readable with just the 80% list and probably none with the 50% list. I may do another post later on to confirm that.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE&lt;/strong&gt;: Now see &lt;a href=&#34;/2015/11/16/actual-core-vocab-lists-greek-new-testament/&#34;&gt;Actual Core Vocab Lists for Greek New Testament&lt;/a&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">In a 2008 paper, Wilfred Major constructs what he calls the 50% and 80% vocab lists for Classical Greek. That is, the lemmata that account for 50% and 80% respectively of tokens in the Classical Greek corpus. In this post I provide the code for the equivalent for the Greek New Testament and talk about some of the results.</summary>
  </entry><entry>
    <title type="html">Mean Dependency Depth</title>
    <link href="https://jktauber.com/2015/10/29/mean-dependency-depth/" rel="alternate" type="text/html" title="Mean Dependency Depth"/>
    <published>2015-10-29</published>
    <updated>2015-10-29</updated>
    <id>https://jktauber.com/2015/10/29/mean-dependency-depth</id>
    <content type="html" xml:base="https://jktauber.com/2015/10/29/mean-dependency-depth/">&lt;p&gt;With dependency paths calculated for the Greek New Testament, we can use mean dependency depth as a proxy for syntactic complexity.&lt;/p&gt;
&lt;p&gt;In &lt;a href=&#34;/2015/10/27/mean-log-frequency-lexemes/&#34;&gt;Mean Log Frequency of Lexemes&lt;/a&gt; I mentioned that, as well as mean log word frequency, reading comprehension measures such as the Lexile® framework use average sentence length. Now that we have &lt;a href=&#34;/2015/10/28/dependency-paths/&#34;&gt;Dependency Paths&lt;/a&gt; calculated, we can explore potentially more useful proxies for syntactic complexity.&lt;/p&gt;
&lt;p&gt;As an initial experiment, we&#39;ll simply take the mean dependency depth of each target where our targets are chapters and by &#34;dependency depth&#34; I simply mean the number of labels in the dependency path. In other words &lt;code&gt;np-O-CL-CL&lt;/code&gt; will count as 4 and we&#39;ll just average across all the words in each chapter.&lt;/p&gt;
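&lt;p&gt;As a rough sketch of that calculation (not the actual script; the function name and sample rows are illustrative, and it assumes word identifiers whose first five digits give book and chapter):&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;from collections import defaultdict


def mean_depths(rows):
    """Map each chapter to the mean number of labels in its dependency paths."""
    totals = defaultdict(lambda: [0, 0])  # chapter: [sum of depths, word count]
    for word_id, path in rows:
        chapter = word_id[:5]  # assumption: book (2 digits) + chapter (3 digits)
        totals[chapter][0] += len(path.split('-'))
        totals[chapter][1] += 1
    return {chapter: depth / n for chapter, (depth, n) in totals.items()}


# three words from John 3.16 with their dependency paths
rows = [
    ('64003016010', 'O-CL-CL'),
    ('64003016011', 'det-np-O-CL-CL'),
    ('64003016012', 'np-O-CL-CL'),
]
print(mean_depths(rows))
&lt;/code&gt;&lt;/pre&gt;
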
&lt;p&gt;An initial run reveals one interesting problem. Luke 3 is given a considerably higher score than anything else because of the analysis of the genealogy (A the son of B the son of C...and so on, leads to very long paths). Reading that genealogy is arguably not that taxing syntactically, which highlights one flaw in the dependency depth approach (or perhaps in the analysis chosen for the genealogy).&lt;/p&gt;
&lt;p&gt;This aside, let&#39;s look at what this measure identifies as easiest chapters:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;2685 67009
2715 67006
2746 66014
2831 67014
2840 66013
2840 69005
2841 67007
2869 66007
2888 67016
2892 69003
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Interestingly, the top 10 chapters for lowest mean dependency depth are all in Romans, 1 Corinthians and Galatians.&lt;/p&gt;
&lt;p&gt;If we average, instead, across entire books, the top ten are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;3 John&lt;/li&gt;
&lt;li&gt;1 Corinthians&lt;/li&gt;
&lt;li&gt;1 John&lt;/li&gt;
&lt;li&gt;James&lt;/li&gt;
&lt;li&gt;Galatians&lt;/li&gt;
&lt;li&gt;John&lt;/li&gt;
&lt;li&gt;Romans&lt;/li&gt;
&lt;li&gt;Matthew&lt;/li&gt;
&lt;li&gt;Mark&lt;/li&gt;
&lt;li&gt;2 John&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;which is perhaps a little less surprising.&lt;/p&gt;
&lt;p&gt;The &lt;em&gt;hardest&lt;/em&gt; chapters, Luke 3 aside, are the first chapters of Ephesians, 2 Timothy and Colossians, which probably isn&#39;t much of a surprise either. The hardest books overall are Ephesians and Colossians.&lt;/p&gt;
&lt;p&gt;The code is available &lt;a href=&#34;https://gist.github.com/jtauber/16631ec63e6657f9a423&#34;&gt;here&lt;/a&gt; (tweak line 13 to get book-level stats).&lt;/p&gt;
&lt;p&gt;Note, this all may be quite sensitive to the choice of analysis. It would be an interesting exercise to see, for example, what the PROIEL dependency analysis yields.&lt;/p&gt;
&lt;p&gt;In future posts, we&#39;ll try a few more measures and then try to bring them together to see how chapters (or books, or authors) compare across multiple criteria.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">With dependency paths calculated for the Greek New Testament, we can use mean dependency depth as a proxy for syntactic complexity.</summary>
  </entry><entry>
    <title type="html">Dependency Paths</title>
    <link href="https://jktauber.com/2015/10/28/dependency-paths/" rel="alternate" type="text/html" title="Dependency Paths"/>
    <published>2015-10-28</published>
    <updated>2015-10-28</updated>
    <id>https://jktauber.com/2015/10/28/dependency-paths</id>
    <content type="html" xml:base="https://jktauber.com/2015/10/28/dependency-paths/">&lt;p&gt;For numerous corpus linguistics applications, it&#39;s useful to have a word-level indication of syntax. A presentation by Vanessa and Robert Gorman gave me the idea of using dependency paths for this purpose so I&#39;ve now calculated them for the GNT based on the GBI syntax trees.&lt;/p&gt;
&lt;p&gt;The presentation by the Gormans was entitled &lt;a href=&#34;http://sites.tufts.edu/perseusupdates/events/dcne/greek-historiography-through-dependency-syntax-treebanking/&#34;&gt;Greek Historiography Through Dependency Syntax Treebanking&lt;/a&gt; and they refer to the dependency paths as &#34;syntactic words&#34; or &#34;swords&#34; for short.&lt;/p&gt;
&lt;p&gt;While their particular interest is authorship, the Gormans make an excellent point about the value of these dependency paths:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The chief advantage of recasting dependencies as syntax words is that they are immediately valuable: with trivial modifications such texts can be put into standard text-processing software to produce type-token ratios, word frequency histograms, etc., providing detailed syntactic information about individual authors.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I&#39;ve previously written about &lt;a href=&#34;/2015/07/02/converting-gbi-syntax-trees-dependency-analysis/&#34;&gt;Converting the GBI Syntax Trees to a Dependency Analysis&lt;/a&gt; so it&#39;s just a small step to producing dependency paths.&lt;/p&gt;
&lt;p&gt;So if we take the output for the first part of John 3.16 from this dependency conversion:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;64003016001 Οὕτως 64003016003 ADV
64003016002 γὰρ 64003016003 conj
64003016003 ἠγάπησεν None CL
64003016004 ὁ 64003016005 det
64003016005 θεὸς 64003016003 S
64003016006 τὸν 64003016007 det
64003016007 κόσμον 64003016003 O
64003016008 ὥστε 64003016013 conj
64003016009 τὸν 64003016010 det
64003016010 υἱὸν 64003016013 O
64003016011 τὸν 64003016012 det
64003016012 μονογενῆ 64003016010 np
64003016013 ἔδωκεν, 64003016003 CL
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;we can easily build up the dependency paths / swords:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;64003016001 Οὕτως ADV-CL
64003016002 γὰρ conj-CL
64003016003 ἠγάπησεν CL
64003016004 ὁ det-S-CL
64003016005 θεὸς S-CL
64003016006 τὸν det-O-CL
64003016007 κόσμον O-CL
64003016008 ὥστε conj-CL-CL
64003016009 τὸν det-O-CL-CL
64003016010 υἱὸν O-CL-CL
64003016011 τὸν det-np-O-CL-CL
64003016012 μονογενῆ np-O-CL-CL
64003016013 ἔδωκεν, CL-CL
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;So it will tell you that μονογενῆ is qualifying the object of a subordinate clause (at least according to the GBI analysis). We&#39;ve thrown away the noun it&#39;s modifying (υἱὸν) and the verb in the subordinate clause it&#39;s the object of (ἔδωκεν) and the verb in the main clause (ἠγάπησεν), but &lt;code&gt;np-O-CL-CL&lt;/code&gt; is a decent label for its syntactic role as qualifying the object of a subordinate clause.&lt;/p&gt;
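&lt;p&gt;The step from the dependency table to the paths can be sketched like this (a simplified illustration rather than the gist code; the function name and tuple layout are assumptions). For each word, follow the head links up to the root and join the labels met along the way.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;def dependency_paths(rows):
    """Turn (word_id, form, head_id, label) rows into (word_id, form, path) rows."""
    by_id = {word_id: (head_id, label) for word_id, _, head_id, label in rows}
    result = []
    for word_id, form, _, _ in rows:
        labels = []
        current = word_id
        while current is not None:  # climb head links until we pass the root
            head_id, label = by_id[current]
            labels.append(label)
            current = head_id
        result.append((word_id, form, '-'.join(labels)))
    return result


# a subset of the John 3.16 rows shown earlier in this post
rows = [
    ('64003016003', 'ἠγάπησεν', None, 'CL'),
    ('64003016010', 'υἱὸν', '64003016013', 'O'),
    ('64003016011', 'τὸν', '64003016012', 'det'),
    ('64003016012', 'μονογενῆ', '64003016010', 'np'),
    ('64003016013', 'ἔδωκεν,', '64003016003', 'CL'),
]
for row in dependency_paths(rows):
    print(*row)
&lt;/code&gt;&lt;/pre&gt;
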
&lt;p&gt;The code I used is available &lt;a href=&#34;https://gist.github.com/jtauber/676c7030d9b56f3e6acf&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">For numerous corpus linguistics applications, it&#39;s useful to have a word-level indication of syntax. A presentation by Vanessa and Robert Gorman gave me the idea of using dependency paths for this purpose so I&#39;ve now calculated them for the GNT based on the GBI syntax trees.</summary>
  </entry><entry>
    <title type="html">Mean Log Frequency of Lexemes</title>
    <link href="https://jktauber.com/2015/10/27/mean-log-frequency-lexemes/" rel="alternate" type="text/html" title="Mean Log Frequency of Lexemes"/>
    <published>2015-10-27</published>
    <updated>2015-10-27</updated>
    <id>https://jktauber.com/2015/10/27/mean-log-frequency-lexemes</id>
    <content type="html" xml:base="https://jktauber.com/2015/10/27/mean-log-frequency-lexemes/">&lt;p&gt;One component of many readability measures on texts is the mean log word frequency. Here I do a basic calculation across chapters in the Greek New Testament (with code provided).&lt;/p&gt;
&lt;p&gt;Usually, the mean log word frequency is used in conjunction with something like the log mean sentence length (for example in the Lexile® framework). The latter is used as a proxy for syntactic complexity but, having a syntactic analysis, I think we can do better and I&#39;ll explore that in a future post.&lt;/p&gt;
&lt;p&gt;For now, though, I wanted to get a per-chapter measure just based on mean log frequency of lexemes.&lt;/p&gt;
&lt;p&gt;The code is available &lt;a href=&#34;https://gist.github.com/jtauber/8e9156b34f452ea4cd89&#34;&gt;here&lt;/a&gt;. It&#39;s easy to adjust the targets (by default chapters, specified on line 14) and the items (by default lexemes, specified on line 15).&lt;/p&gt;
&lt;p&gt;The result of running the script is something like this:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;6153 0101 436
5757 0102 457
5471 0103 331
5487 0104 428
5437 0105 821
5532 0106 648
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;where the first column is -1000 times the mean log frequency (so the higher, the harder to read), the second column is the book and chapter number and the third column is just the number of word tokens in that chapter.&lt;/p&gt;
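&lt;p&gt;As an aside, the score in the first column can be sketched as follows. This is an illustration only: it assumes natural logarithms and relative frequencies over the whole input, which may differ from what the gist does, and the function name and sample rows are made up.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;import math
from collections import Counter, defaultdict


def mean_log_frequency(rows):
    """Score each target by -1000 times the mean log relative frequency of its items."""
    rows = list(rows)
    counts = Counter(item for _, item in rows)
    total = len(rows)
    logs = defaultdict(list)
    for target, item in rows:
        # assumption: natural log of the item's share of all tokens
        logs[target].append(math.log(counts[item] / total))
    return {t: round(-1000 * sum(v) / len(v)) for t, v in logs.items()}


# (target, lexeme) pairs; toy data rather than MorphGNT
rows = [('0101', 'ὁ'), ('0101', 'λόγος'), ('0102', 'ὁ'), ('0102', 'ὁ')]
scores = mean_log_frequency(rows)
print(scores)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In the toy data the second target only contains the most frequent lexeme, so it gets the lower (easier) score.&lt;/p&gt;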
&lt;p&gt;If we sort the script&#39;s output, we should get a list of the easiest chapters to read (at least by the measure of mean log lexeme frequency):&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;4704 2304 449
4746 2305 429
4926 0417 498
4949 2301 207
4973 0414 577
5025 0408 905
5036 2303 467
5044 2302 585
5080 0403 657
5090 2710 291
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It is perhaps not surprising that the easiest chapters are from 1 John and John&#39;s gospel (with Rev 10 coming in at number 10).&lt;/p&gt;
&lt;p&gt;It will be interesting to see if we get similar results once we factor in some measure of syntactic complexity.&lt;/p&gt;
&lt;p&gt;Incidentally, the most difficult chapter to read based on mean log lexeme frequency is 2 Peter 2 although 1 Timothy and Titus feature quite a bit in the most difficult ten chapters as well.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">One component of many readability measures on texts is the mean log word frequency. Here I do a basic calculation across chapters in the Greek New Testament (with code provided).</summary>
  </entry><entry>
    <title type="html">Updated Vocabulary Coverage Statistics</title>
    <link href="https://jktauber.com/2015/10/26/updated-vocabulary-coverage-statistics/" rel="alternate" type="text/html" title="Updated Vocabulary Coverage Statistics"/>
    <published>2015-10-26</published>
    <updated>2015-10-26</updated>
    <id>https://jktauber.com/2015/10/26/updated-vocabulary-coverage-statistics</id>
    <content type="html" xml:base="https://jktauber.com/2015/10/26/updated-vocabulary-coverage-statistics/">&lt;p&gt;In various mailing list posts, blog posts and talks, I&#39;ve shown vocabulary coverage statistics. It&#39;s time to update the code to use more recent data and republish the results here.&lt;/p&gt;
&lt;p&gt;The vocabulary coverage tables have a number of different parameters:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;what are the items being learnt: lexemes or forms or something else?&lt;/li&gt;
&lt;li&gt;what are the targets: verses or sentences or something else?&lt;/li&gt;
&lt;li&gt;what ordering is being used: item frequency or something else?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;and, of course, what text and lemmatization is being used.&lt;/p&gt;
&lt;p&gt;Most of my published stats before were based on the UBS3 version of MorphGNT. Here I&#39;m going to use the latest MorphGNT based on the SBLGNT (MorphGNT 6.06) and I&#39;m going to explore not just verses but (in followup posts) clauses and sentences from the GBI Syntax Trees and paragraphs from the SBLGNT.&lt;/p&gt;
&lt;p&gt;I also want to start incorporating the information from my morphological lexicon into the item/target modeling and ordering algorithms.&lt;/p&gt;
&lt;p&gt;But first let&#39;s just update the basic stats.&lt;/p&gt;
&lt;h2&gt;Verses-Lexemes with Frequency Ordering&lt;/h2&gt;
&lt;p&gt;A target-item file for verses-lexemes can be achieved with:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;awk &#39;{print $1,$7}&#39; sblgnt/*-morphgnt.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;if we then feed that to &lt;a href=&#34;https://github.com/jtauber/graded-reader/blob/cf9f59ca3695d4d832208ef402373a8e08f57da0/code/vocab-coverage.py&#34;&gt;vocab-coverage.py&lt;/a&gt; we get the following result:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;             ANY    50.00%    75.00%    90.00%    95.00%   100.00%
------------------------------------------------------------------
   100    99.91%    91.07%    24.36%     2.13%     0.64%     0.48%
   200    99.92%    96.83%    51.80%     9.75%     3.43%     2.54%
   500    99.97%    99.13%    82.23%    36.57%    17.81%    13.81%
  1000    99.99%    99.71%    93.60%    62.57%    37.28%    29.99%
  2000   100.00%    99.92%    98.41%    84.95%    65.38%    56.43%
  5000   100.00%   100.00%   100.00%    99.51%    96.44%    94.58%
   ALL   100.00%   100.00%   100.00%   100.00%   100.00%   100.00%
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;What this table is saying is that if you learn, say, the 200 most frequent lexemes, you&#39;ll be able to read 95% of the lexemes in 3.43% of verses.&lt;/p&gt;
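&lt;p&gt;To make the meaning of a single cell concrete, here is a minimal sketch of how one could be computed. It is not &lt;code&gt;vocab-coverage.py&lt;/code&gt; itself: the function name is hypothetical, and it assumes coverage within a target is measured over tokens rather than distinct items.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;from bisect import bisect_left
from collections import Counter, defaultdict


def coverage(pairs, known_count, threshold):
    """Share of targets in which at least `threshold` of the item tokens are known."""
    pairs = list(pairs)
    frequency = Counter(item for _, item in pairs)
    # learn the `known_count` most frequent items first
    known = {item for item, _ in frequency.most_common(known_count)}
    targets = defaultdict(list)
    for target, item in pairs:
        targets[target].append(item in known)
    shares = sorted(sum(flags) / len(flags) for flags in targets.values())
    # targets whose known share falls below the threshold sort before this index
    below = bisect_left(shares, threshold)
    return (len(shares) - below) / len(shares)


# (target, item) pairs in the same shape as the awk output; toy data
pairs = [('v1', 'a'), ('v1', 'b'), ('v2', 'a'), ('v2', 'a'), ('v3', 'c'), ('v3', 'a')]
print(coverage(pairs, 1, 0.5))
print(coverage(pairs, 1, 0.75))
&lt;/code&gt;&lt;/pre&gt;
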
&lt;h2&gt;Verses-Forms with Frequency Ordering&lt;/h2&gt;
&lt;p&gt;A target-item file for verses-forms can be achieved with:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;awk &#39;{print $1,$6}&#39; sblgnt/*-morphgnt.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;if we then feed that to &lt;code&gt;vocab-coverage.py&lt;/code&gt; but with 10000 added as an item count, we get the following result:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;             ANY    50.00%    75.00%    90.00%    95.00%   100.00%
------------------------------------------------------------------
   100    99.82%    57.63%     1.10%     0.04%     0.01%     0.01%
   200    99.86%    78.86%     6.51%     0.34%     0.05%     0.05%
   500    99.91%    92.85%    26.95%     2.23%     0.59%     0.52%
  1000    99.94%    96.95%    51.23%     7.75%     2.31%     1.74%
  2000    99.96%    98.65%    72.52%    21.74%     7.86%     5.80%
  5000    99.97%    99.74%    90.97%    52.13%    28.52%    21.61%
 10000   100.00%    99.94%    98.31%    78.28%    55.19%    45.28%
   ALL   100.00%   100.00%   100.00%   100.00%   100.00%   100.00%
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;What this table is saying is that if you learn, say, the 500 most frequent forms, you&#39;ll be able to read 75% of the forms in 26.95% of verses.&lt;/p&gt;
&lt;p&gt;Various talks, including those at BibleTech in 2010 and 2015, explain a ton of caveats around these numbers but I wanted to at least refresh them (and the code) with the latest data.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">In various mailing list posts, blog posts and talks, I&#39;ve shown vocabulary coverage statistics. It&#39;s time to update the code to use more recent data and republish the results here.</summary>
  </entry><entry>
    <title type="html">Blogging Every Day Between Now and SBL Annual Meeting</title>
    <link href="https://jktauber.com/2015/10/25/blogging-every-day-between-now-sbl-annual-meeting/" rel="alternate" type="text/html" title="Blogging Every Day Between Now and SBL Annual Meeting"/>
    <published>2015-10-25</published>
    <updated>2015-10-25</updated>
    <id>https://jktauber.com/2015/10/25/blogging-every-day-between-now-sbl-annual-meeting</id>
    <content type="html" xml:base="https://jktauber.com/2015/10/25/blogging-every-day-between-now-sbl-annual-meeting/">&lt;p&gt;It&#39;s exactly four weeks until I&#39;m presenting at the SBL Annual Meeting in Atlanta. As I have a long backlog of posts I&#39;ve wanted to do for a while, I thought I might try to blog every day between now and my talk on November 22nd.&lt;/p&gt;
&lt;p&gt;As well as motivating me to finish up some posts and also get some other ideas down in writing, I also hope the blogging will get people more interested in what I&#39;m going to be talking about at the SBL meeting and lay a foundation for some conversations I hope to have with people while there.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">It&#39;s exactly four weeks until I&#39;m presenting at the SBL Annual Meeting in Atlanta. As I have a long backlog of posts I&#39;ve wanted to do for a while, I thought I might try to blog every day between now and my talk on November 22nd.</summary>
  </entry><entry>
    <title type="html">Speaking At The SBL Annual Meeting in Atlanta</title>
    <link href="https://jktauber.com/2015/07/15/speaking-sbl-annual-meeting-atlanta/" rel="alternate" type="text/html" title="Speaking At The SBL Annual Meeting in Atlanta"/>
    <published>2015-07-15</published>
    <updated>2015-07-15</updated>
    <id>https://jktauber.com/2015/07/15/speaking-sbl-annual-meeting-atlanta</id>
    <content type="html" xml:base="https://jktauber.com/2015/07/15/speaking-sbl-annual-meeting-atlanta/">&lt;p&gt;I&#39;ve just finished up registration for the SBL Annual Meeting. Here&#39;s the paper I&#39;ll be presenting.&lt;/p&gt;
&lt;h2&gt;A Morphological Lexicon of New Testament Greek&lt;/h2&gt;
&lt;p&gt;Morphological analyses such as analytical lexicons have typically involved indicating lemma, part-of-speech, morphosyntactic and morphosemantic information (such as case, number, person, gender, tense, voice, mood and degree). Much progress has been made in recent years making analyses of this sort freely available in digital formats, but the kind of information they contain has not advanced significantly for decades. This paper will provide an overview of the work of the MorphGNT project to develop an electronic Morphological Lexicon of New Testament Greek that adds inflectional classes, roots and stems, stem formation and morphophonological processes, principal parts, and derivational morphology. Beyond serving as a database of linguistic information, the goal of the morphological lexicon is to provide an &#34;executable grammar&#34; so particular grammar points discussed in beginner grammars, intermediate grammars or advanced reference grammars can be tested against a corpus in a way that makes completely transparent where the &#34;rules&#34; are followed and where they fall down. This also provides useful data for pedagogical tools such as intelligent tutoring systems that typically require better modeling of latent traits in order to determine what a student actually knows and what items best test that knowledge. All data for the Morphological Lexicon of New Testament Greek is available under a Creative Commons license, and all code used for both the generation and verification of the morphological lexicon is open source.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I&#39;ve just finished up registration for the SBL Annual Meeting. Here&#39;s the paper I&#39;ll be presenting.</summary>
  </entry><entry>
    <title type="html">Types of Disagreement in Syntactic Analyses</title>
    <link href="https://jktauber.com/2015/07/13/types-disagreement-syntactic-analyses/" rel="alternate" type="text/html" title="Types of Disagreement in Syntactic Analyses"/>
    <published>2015-07-13</published>
    <updated>2015-07-13</updated>
    <id>https://jktauber.com/2015/07/13/types-disagreement-syntactic-analyses</id>
    <content type="html" xml:base="https://jktauber.com/2015/07/13/types-disagreement-syntactic-analyses/">&lt;p&gt;As helpful as the GBI Syntax Trees are, I have disagreements with them. Randall and Andi are receptive to feedback but there are very different &lt;em&gt;types&lt;/em&gt; of disagreement that can arise in syntactic analysis so I thought I&#39;d start to note down what they are.&lt;/p&gt;
&lt;p&gt;Some things aren&#39;t disagreements, just corrections. Some are differences of interpretation of the Greek. Some are differences in overall approach.&lt;/p&gt;
&lt;p&gt;Here&#39;s a first attempt at a more refined categorization of types. I&#39;ll call the person/group who did the initial (published) analysis A1 and the person/group who has the change/disagreement A2.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;I&lt;/strong&gt;. &lt;strong&gt;correction&lt;/strong&gt;—A1 actually agrees with A2 but simply made a mistake and can uncontroversially update their analysis accordingly&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;II&lt;/strong&gt;. &lt;strong&gt;ambiguity&lt;/strong&gt;—both A1 and A2&#39;s analysis is possible in the eyes of the other, but based on other factors, A1 and A2 disagree which analysis to go with. Perhaps this could further be refined into:&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;IIA&lt;/strong&gt;. cases where A1 and A2 each think their own analysis is the &lt;em&gt;more&lt;/em&gt; likely; versus&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;IIB&lt;/strong&gt;. cases where A1 and A2 each think their own analysis is the &lt;em&gt;only&lt;/em&gt; likely.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;III&lt;/strong&gt;. &lt;strong&gt;terminology/framework&lt;/strong&gt;—A1 and A2 agree on structure and relationship up to a certain isomorphism but not in the specifics. This could be further split into:&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;IIIA&lt;/strong&gt;. cases where A1 and A2&#39;s analyses are structurally identical but just different in labels&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;IIIB&lt;/strong&gt;. cases where A1 and A2&#39;s analyses differ in structure even though they are derivable from one another&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;IV&lt;/strong&gt;. &lt;strong&gt;irreconcilable&lt;/strong&gt;—A1 and A2 disagree on the way the language actually works and the analyses can&#39;t easily be mapped to one another.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I think many of my disagreements with the GBI Trees so far are of &lt;strong&gt;type IIIB&lt;/strong&gt; which means it is likely possible for me to programmatically generate an alternative analysis with my preferred structure. Indeed, converting to a dependency analysis is a simple example of this but even different choices of head within the constituent structure (which is a major source of systemic disagreement) are easy to make.&lt;/p&gt;
&lt;p&gt;The great thing about &lt;strong&gt;type III&lt;/strong&gt; in general is that even if you disagree with A1, you can still use the analysis to explore the syntactic phenomenon you want (you just have to map your queries to their labels and their conventions).&lt;/p&gt;
&lt;p&gt;I should also note that an important aspect to dealing with this is proper documentation of conventions followed.&lt;/p&gt;
&lt;p&gt;With these thoughts down, I&#39;m now interested in other work that has already been done in this area.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">As helpful as the GBI Syntax Trees are, I have disagreements with them. Randall and Andi are receptive to feedback but there are very different &lt;em&gt;types&lt;/em&gt; of disagreement that can arise in syntactic analysis so I thought I&#39;d start to note down what they are.</summary>
  </entry><entry>
    <title type="html">Converting the GBI Syntax Trees to a Dependency Analysis</title>
    <link href="https://jktauber.com/2015/07/02/converting-gbi-syntax-trees-dependency-analysis/" rel="alternate" type="text/html" title="Converting the GBI Syntax Trees to a Dependency Analysis"/>
    <published>2015-07-02</published>
    <updated>2015-07-02</updated>
    <id>https://jktauber.com/2015/07/02/converting-gbi-syntax-trees-dependency-analysis</id>
    <content type="html" xml:base="https://jktauber.com/2015/07/02/converting-gbi-syntax-trees-dependency-analysis/">&lt;p&gt;With one child on each branch identified as the head, a constituent analysis can be converted to a dependency analysis. Fortunately, the GBI syntax trees have an explicit indication of the head, so I went ahead and converted them to a dependency format.&lt;/p&gt;
&lt;p&gt;Non-leaf nodes in the GBI syntax trees have a &lt;code&gt;Head&lt;/code&gt; attribute which indicates the index of the child considered the head.&lt;/p&gt;
&lt;p&gt;So the algorithm is fairly straightforward. For each leaf-node:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;walk up the tree until you find a node whose &lt;code&gt;Head&lt;/code&gt; attribute is NOT the index of the child we just came from&lt;/li&gt;
&lt;li&gt;follow the &lt;code&gt;Head&lt;/code&gt; attributes back down the tree until you hit another leaf-node&lt;/li&gt;
&lt;li&gt;that second leaf-node is the head of the leaf-node you started on&lt;/li&gt;
&lt;li&gt;the &#34;type&#34; of the dependency is the &lt;code&gt;Cat&lt;/code&gt; of the second-to-last node you visited walking up in step 1.&lt;/li&gt;
&lt;/ul&gt;
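&lt;p&gt;To illustrate the four steps on a toy tree (this is separate from the actual script; the &lt;code&gt;Node&lt;/code&gt; class, the zero-based &lt;code&gt;head&lt;/code&gt; index and the tree shape are all made up for the example and do not reflect the real GBI data format):&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;class Node:
    """Hypothetical stand-in for a syntax tree node."""

    def __init__(self, cat, children=(), head=0, word=None):
        self.cat, self.head, self.word = cat, head, word
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def leaves(node):
    """Yield the leaf nodes of a tree in order."""
    if not node.children:
        yield node
    for child in node.children:
        yield from leaves(child)


def dependency(leaf):
    """Return (head word or None, dependency type) for a leaf node."""
    node = leaf
    # step 1: climb for as long as we are the head child of our parent
    while node.parent is not None and node.parent.children[node.parent.head] is node:
        node = node.parent
    if node.parent is None:
        return None, node.cat  # this leaf heads the whole tree
    # step 2: descend from the node we stopped at, following the head children
    target = node.parent
    while target.children:
        target = target.children[target.head]
    # steps 3 and 4: that leaf is the head; the type is the Cat we came up from
    return target.word, node.cat


det = Node('det', word='ὁ')
noun = Node('noun', word='θεὸς')
verb = Node('verb', word='ἠγάπησεν')
subject = Node('S', [Node('np', [det, noun], head=1)])
tree = Node('CL', [subject, Node('V', [verb])], head=1)
for leaf in leaves(tree):
    print(leaf.word, *dependency(leaf))
&lt;/code&gt;&lt;/pre&gt;
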
&lt;p&gt;The only catch is the source data this script uses omits a &lt;code&gt;Head&lt;/code&gt; altogether in three types of cases. The original GBI analysis treated the &lt;code&gt;Head&lt;/code&gt; as being &lt;code&gt;&#34;1&#34;&lt;/code&gt; in these cases so I special case that in the code. I don&#39;t necessarily agree with the choice but it&#39;s easy to change (see below).&lt;/p&gt;
&lt;p&gt;I&#39;ve put the code in a gist: &lt;a href=&#34;https://gist.github.com/jtauber/c02d0928811b7ed21c9a&#34;&gt;https://gist.github.com/jtauber/c02d0928811b7ed21c9a&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The result (on the first part of John 3.16) is:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;64003016001 Οὕτως 64003016003 ADV
64003016002 γὰρ 64003016003 conj
64003016003 ἠγάπησεν None CL
64003016004 ὁ 64003016005 det
64003016005 θεὸς 64003016003 S
64003016006 τὸν 64003016007 det
64003016007 κόσμον 64003016003 O
64003016008 ὥστε 64003016013 conj
64003016009 τὸν 64003016010 det
64003016010 υἱὸν 64003016013 O
64003016011 τὸν 64003016012 det
64003016012 μονογενῆ 64003016010 np
64003016013 ἔδωκεν, 64003016003 CL
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The &lt;a href=&#34;/labs/dependency-highlighting.html&#34;&gt;dependency relationship color highlighting&lt;/a&gt; experiment on this site shows a possible way of conveying this dependency information in a text (in this case, 2 John).&lt;/p&gt;
&lt;p&gt;As mentioned, I don&#39;t necessarily always agree with the GBI choice of head. However, it&#39;s fairly straightforward to alter the code to override the choice of head in certain contexts.&lt;/p&gt;
&lt;p&gt;For example, if you consider the complementizer the head, you can just add code that takes &lt;code&gt;Head=&#34;0&#34;&lt;/code&gt; where &lt;code&gt;Rule=&#34;that-VP&#34;&lt;/code&gt; and so on. Similarly with prepositions, determiners, etc.&lt;/p&gt;
&lt;p&gt;Finally note that it&#39;s not quite possible to reconstruct the original tree from the dependency data because the algorithm effectively eliminates information on some intermediate nodes. Some may consider this an advantage.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">With one child on each branch identified as the head, a constituent analysis can be converted to a dependency analysis. Fortunately, the GBI syntax trees have an explicit indication of the head, so I went ahead and converted them to a dependency format.</summary>
  </entry><entry>
    <title type="html">pyuca supports Python 2 again</title>
    <link href="https://jktauber.com/2015/05/13/pyuca-supports-python-2-again/" rel="alternate" type="text/html" title="pyuca supports Python 2 again"/>
    <published>2015-05-13</published>
    <updated>2015-05-13</updated>
    <id>https://jktauber.com/2015/05/13/pyuca-supports-python-2-again</id>
    <content type="html" xml:base="https://jktauber.com/2015/05/13/pyuca-supports-python-2-again/">&lt;p&gt;Thanks to Chris Beaven, Paul McLanahan and Michal Čihař, Python 2 support is back in pyuca 1.1.&lt;/p&gt;
&lt;p&gt;There was a small amount of complaining about me dropping Python 2 support for the big release of pyuca last year.&lt;/p&gt;
&lt;p&gt;I didn&#39;t have the time or motivation to bring it back, though.&lt;/p&gt;
&lt;p&gt;Fortunately, other people did and thanks to Chris, Paul and Michal, pyuca 1.1 supports Python 2 &lt;em&gt;and&lt;/em&gt; 3.&lt;/p&gt;
&lt;p&gt;The repo is at &lt;a href=&#34;https://github.com/jtauber/pyuca&#34;&gt;https://github.com/jtauber/pyuca&lt;/a&gt; and you can get pyuca from PyPI with &lt;code&gt;pip install pyuca&lt;/code&gt;.&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Thanks to Chris Beaven, Paul McLanahan and Michal Čihař, Python 2 support is back in pyuca 1.1.</summary>
  </entry><entry>
    <title type="html">My BibleTech 2015 Talk</title>
    <link href="https://jktauber.com/2015/05/06/my-bibletech-2015-talk/" rel="alternate" type="text/html" title="My BibleTech 2015 Talk"/>
    <published>2015-05-06</published>
    <updated>2015-05-06</updated>
    <id>https://jktauber.com/2015/05/06/my-bibletech-2015-talk</id>
    <content type="html" xml:base="https://jktauber.com/2015/05/06/my-bibletech-2015-talk/">&lt;p&gt;BibleTech talks were not recorded but I turned on my iPhone&#39;s Voice Memo recording and later sync&#39;d the audio with my slides to make this video.&lt;/p&gt;
&lt;iframe src=&#34;https://player.vimeo.com/video/127114639&#34; width=&#34;500&#34; height=&#34;375&#34; frameborder=&#34;0&#34; webkitallowfullscreen mozallowfullscreen allowfullscreen&gt;&lt;/iframe&gt;

&lt;p&gt;The abstract:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In an update on the ongoing work he has spoken about in previous Bible Tech conferences, James will talk about recent developments in open source learning software and the MorphGNT linguistic database, and how the two work together to provide tools for improving the learning of New Testament Greek.&lt;/p&gt;
&lt;/blockquote&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">BibleTech talks were not recorded but I turned on my iPhone&#39;s Voice Memo recording and later sync&#39;d the audio with my slides to make this video.</summary>
  </entry><entry>
    <title type="html">Version 1.0 of pyuca released</title>
    <link href="https://jktauber.com/2014/02/01/version-10-pyuca-released/" rel="alternate" type="text/html" title="Version 1.0 of pyuca released"/>
    <published>2014-02-01</published>
    <updated>2014-02-01</updated>
    <id>https://jktauber.com/2014/02/01/version-10-pyuca-released</id>
    <content type="html" xml:base="https://jktauber.com/2014/02/01/version-10-pyuca-released/">&lt;p&gt;pyuca is my pure Python implementation of the Unicode Collation Algorithm (for sorting, amongst other things, Greek).&lt;/p&gt;
&lt;p&gt;I&#39;ve just released version 1.0 for Python 3.3 and above, and it passes 100% of the UCA conformance tests.&lt;/p&gt;
&lt;p&gt;I implemented enough back in 2006 to be able to sort Ancient Greek and released it on PyPI in 2012.&lt;/p&gt;
&lt;p&gt;Since then, with input from others, I&#39;ve made various improvements but in October last year I decided to start testing against the comprehensive UCA conformance tests provided by the Unicode Consortium. The last couple of days I&#39;ve had an intense sprint where I got 100% of the tests passing and also 100% code coverage.&lt;/p&gt;
&lt;p&gt;I also made the decision to ditch Python 2 support as part of my encouragement to get people to move to Python 3.&lt;/p&gt;
&lt;p&gt;The repo is available at &lt;a href=&#34;https://github.com/jtauber/pyuca/&#34;&gt;https://github.com/jtauber/pyuca/&lt;/a&gt; but you can most easily get pyuca with&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;pip install pyuca
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;and then use it as follows:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;from pyuca import Collator
c = Collator(&amp;quot;allkeys.txt&amp;quot;)

sorted_words = sorted(words, key=c.sort_key)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;UPDATE (2015-05-13)&lt;/strong&gt;: &lt;a href=&#34;/2015/05/13/pyuca-supports-python-2-again/&#34;&gt;Python 2 support is back in 1.1&lt;/a&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">pyuca is my pure Python implementation of the Unicode Collation Algorithm (for sorting, amongst other things, Greek).</summary>
  </entry><entry>
    <title type="html">Rebasing MorphGNT off SBLGNT</title>
    <link href="https://jktauber.com/2011/01/18/rebasing-morphgnt-sblgnt/" rel="alternate" type="text/html" title="Rebasing MorphGNT off SBLGNT"/>
    <published>2011-01-18</published>
    <updated>2011-01-18</updated>
    <id>https://jktauber.com/2011/01/18/rebasing-morphgnt-sblgnt</id>
    <content type="html" xml:base="https://jktauber.com/2011/01/18/rebasing-morphgnt-sblgnt/">&lt;p&gt;The last three months, I&#39;ve been working on rebasing the MorphGNT database off the SBLGNT text rather than the UBS3.&lt;/p&gt;
&lt;p&gt;While I have had permission to work with the CCAT database for over a decade, the fact the UBS3 text can be extracted from it has always been problematic. The existence of the SBLGNT solves the problem of having a critical text with clear licensing and so, in October 2010, I started the process of moving the MorphGNT analysis to the SBLGNT text.&lt;/p&gt;
&lt;p&gt;This task is mostly done and the work-in-progress is available on GitHub at &lt;a href=&#34;https://github.com/morphgnt/sblgnt&#34;&gt;https://github.com/morphgnt/sblgnt&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It was a three-step process, done one book at a time.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A Python script was used to do a first-pass alignment. The script allowed for differences in punctuation, accentuation, capitalization and movable-nu.&lt;/li&gt;
&lt;li&gt;Any differences were then manually inspected and corrected. In 90% of cases it was a simple re-ordering of words but in the other 10%, a fresh analysis had to be made. These analyses were then checked against various sources such as BDAG, Perseus and the Lexham Reverse Interlinear.&lt;/li&gt;
&lt;li&gt;Finally, I wrote another Python script that checked various heuristics.&lt;/li&gt;
&lt;/ul&gt;
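&lt;p&gt;The kind of comparison the first-pass alignment relies on can be sketched roughly as follows. This is only an illustration, not the actual script: the function name &lt;code&gt;alignment_key&lt;/code&gt; and the crude movable-nu rule are assumptions made for the example.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code class=&#34;language-python&#34;&gt;import unicodedata


def alignment_key(token):
    # Hypothetical helper: reduce a token to a key that ignores punctuation,
    # accentuation, capitalization and movable nu.
    decomposed = unicodedata.normalize('NFD', token)
    # Keep letters only; this drops punctuation and combining accents/breathings.
    letters = ''.join(
        ch for ch in decomposed if unicodedata.category(ch).startswith('L')
    )
    key = letters.lower().replace('ς', 'σ')
    # Crude movable-nu rule (an assumption): drop a final nu after these endings.
    # It over-matches (e.g. the preposition ἐν) but is applied to both texts alike.
    if key.endswith(('σιν', 'εν', 'εστιν')):
        key = key[:-1]
    return key


# Two tokens are treated as the same word when their keys match.
same = alignment_key('εἶπεν') == alignment_key('εἶπε')
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Tokens whose keys differ would then be the ones left for manual inspection in the second step.&lt;/p&gt;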
&lt;p&gt;I&#39;m in the process of making a batch of corrections based on the third step and then I&#39;ll formally release what will be called MorphGNT 6.0 (although possibly as a beta such as 6.0b1).&lt;/p&gt;
&lt;p&gt;The next step (which I&#39;ve started in parallel) will merge in the Robinson analysis and parse codes on the road to a completely new set of parse codes for MorphGNT 7.0.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on morphgnt.org&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">The last three months, I&#39;ve been working on rebasing the MorphGNT database off the SBLGNT text rather than the UBS3.</summary>
  </entry><entry>
    <title type="html">Inline Replacement for John 2</title>
    <link href="https://jktauber.com/2010/04/25/inline-replacement-john-2/" rel="alternate" type="text/html" title="Inline Replacement for John 2"/>
    <published>2010-04-25</published>
    <updated>2010-04-25</updated>
    <id>https://jktauber.com/2010/04/25/inline-replacement-john-2</id>
    <content type="html" xml:base="https://jktauber.com/2010/04/25/inline-replacement-john-2/">&lt;p&gt;A post to the graded-reader mailing list from April 25, 2010.&lt;/p&gt;
&lt;p&gt;This afternoon and evening, I updated and open sourced my code for doing inline replacement and did a rough literal translation of John 2, marked up with the PROIEL clause (and in some cases phrase) boundaries.&lt;/p&gt;
&lt;p&gt;I then just ran a next-best ordering based on forms only, with the targets being those that are PRED or multi-word SUB (adding the latter works quite well).&lt;/p&gt;
&lt;p&gt;I&#39;ve included the complete results below. The main outstanding issue is it doesn&#39;t yet properly handle discontinuous clauses (the parenthetical in 2.9) or clauses that span verses (2.9,2.10; 2.14,2.15,2.16; 2.24,2.25).&lt;/p&gt;
&lt;p&gt;All the code (and my annotated translation) is available on GitHub.&lt;/p&gt;
&lt;p&gt;James&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;[343427] John 2.2&lt;br /&gt;
&lt;strong&gt;ὁ Ἰησοῦς&lt;/strong&gt; and his disciples were invited to the wedding&lt;/p&gt;
&lt;p&gt;[343464] John 2.4&lt;br /&gt;
&lt;strong&gt;ὁ Ἰησοῦς&lt;/strong&gt; says to her , what (concern is that) to me and you , woman ? My hour is not yet come&lt;/p&gt;
&lt;p&gt;[343517] John 2.7&lt;br /&gt;
&lt;strong&gt;ὁ Ἰησοῦς&lt;/strong&gt; says to them : fill the water-jars with water and they filled them up to the top&lt;/p&gt;
&lt;p&gt;[343607] John 2.11&lt;br /&gt;
This beginning of signs &lt;strong&gt;ὁ Ἰησοῦς&lt;/strong&gt; did in Cana of Galilee and revealed his glory and his disciples believed in him&lt;/p&gt;
&lt;p&gt;[343665] John 2.13&lt;br /&gt;
and near was the passover of the Jews and &lt;strong&gt;ὁ Ἰησοῦς&lt;/strong&gt; went up to Jerusalem&lt;/p&gt;
&lt;p&gt;[343841] John 2.22&lt;br /&gt;
so when he was raised from the dead , his disciples remembered that he was saying this and they believed the Scripture and the word which &lt;strong&gt;ὁ Ἰησοῦς&lt;/strong&gt; said&lt;/p&gt;
&lt;p&gt;[343430] John 2.2&lt;br /&gt;
ὁ Ἰησοῦς and &lt;strong&gt;οἱ μαθηταὶ αὐτοῦ&lt;/strong&gt; were invited to the wedding&lt;/p&gt;
&lt;p&gt;[343623] John 2.11&lt;br /&gt;
This beginning of signs ὁ Ἰησοῦς did in Cana of Galilee and revealed his glory and &lt;strong&gt;οἱ μαθηταὶ αὐτοῦ&lt;/strong&gt; believed in him&lt;/p&gt;
&lt;p&gt;[343642] John 2.12&lt;br /&gt;
After this , he and his mother and his brothers and &lt;strong&gt;οἱ μαθηταὶ αὐτοῦ&lt;/strong&gt; went down into Capernaum and there they remained not many days&lt;/p&gt;
&lt;p&gt;[343736] John 2.17&lt;br /&gt;
&lt;strong&gt;οἱ μαθηταὶ αὐτοῦ&lt;/strong&gt; remembered that it has been written : the zeal for your house will devour me&lt;/p&gt;
&lt;p&gt;[343825] John 2.22&lt;br /&gt;
so when he was raised from the dead , &lt;strong&gt;οἱ μαθηταὶ αὐτοῦ&lt;/strong&gt; remembered that he was saying this and they believed the Scripture and the word which ὁ Ἰησοῦς said&lt;/p&gt;
&lt;p&gt;[343753] John 2.18&lt;br /&gt;
so &lt;strong&gt;οἱ Ἰουδαῖοι&lt;/strong&gt; answered and said to him : what sign are you showing us that you do these things ?&lt;/p&gt;
&lt;p&gt;[343788] John 2.20&lt;br /&gt;
so &lt;strong&gt;οἱ Ἰουδαῖοι&lt;/strong&gt; said : this temple was built in forty-six years and you will raise it in three days ?&lt;/p&gt;
&lt;p&gt;[343549] John 2.9,2.10&lt;br /&gt;
as &lt;strong&gt;ὁ ἀρχιτρίκλινος&lt;/strong&gt; tasted the water having become wine and didn&#39;t know from where it came ( but the servants who drew the water knew ) the head-steward calls the groom and says to him : all men first put out the good wine and when they are drunk , the inferior . you have kept the good wine until now&lt;/p&gt;
&lt;p&gt;[343574] John 2.9,2.10&lt;br /&gt;
as ὁ ἀρχιτρίκλινος tasted the water having become wine and didn&#39;t know from where it came ( but the servants who drew the water knew ) &lt;strong&gt;ὁ ἀρχιτρίκλινος&lt;/strong&gt; calls the groom and says to him : all men first put out the good wine and when they are drunk , the inferior . you have kept the good wine until now&lt;/p&gt;
&lt;p&gt;[343514] John 2.7&lt;br /&gt;
&lt;strong&gt;λέγει αὐτοῖς ὁ Ἰησοῦς&lt;/strong&gt; : fill the water-jars with water and they filled them up to the top&lt;/p&gt;
&lt;p&gt;[343531] John 2.8&lt;br /&gt;
&lt;strong&gt;καὶ λέγει αὐτοῖς&lt;/strong&gt; , draw now and carry it to the head-steward and they brought it&lt;/p&gt;
&lt;p&gt;[343428] John 2.2&lt;br /&gt;
&lt;strong&gt;καὶ ὁ Ἰησοῦς καὶ οἱ μαθηταὶ αὐτοῦ&lt;/strong&gt; were invited to the wedding&lt;/p&gt;
&lt;p&gt;[343481] John 2.5&lt;br /&gt;
&lt;strong&gt;ἡ μήτηρ αὐτοῦ&lt;/strong&gt; says to the servants : do whatever he tells you to&lt;/p&gt;
&lt;p&gt;[343634] John 2.12&lt;br /&gt;
After this , he and &lt;strong&gt;ἡ μήτηρ αὐτοῦ&lt;/strong&gt; and his brothers and οἱ μαθηταὶ αὐτοῦ went down into Capernaum and there they remained not many days&lt;/p&gt;
&lt;p&gt;[343418] John 2.1&lt;br /&gt;
And on the third day , a wedding was happening in Cana of Galilee and &lt;strong&gt;ἡ μήτηρ τοῦ Ἰησοῦ&lt;/strong&gt; was there&lt;/p&gt;
&lt;p&gt;[343451] John 2.3&lt;br /&gt;
There was no wine because the wedding wine had been finished off . Then &lt;strong&gt;ἡ μήτηρ τοῦ Ἰησοῦ&lt;/strong&gt; says to him : there is no wine&lt;/p&gt;
&lt;p&gt;[343576] John 2.9,2.10&lt;br /&gt;
as ὁ ἀρχιτρίκλινος tasted the water having become wine and didn&#39;t know from where it came ( but the servants who drew the water knew ) ὁ ἀρχιτρίκλινος calls the groom &lt;strong&gt;λέγει αὐτῷ&lt;/strong&gt; all men first put out the good wine and when they are drunk , the inferior . you have kept the good wine until now&lt;/p&gt;
&lt;p&gt;[343755] John 2.18&lt;br /&gt;
so οἱ Ἰουδαῖοι answered and &lt;strong&gt;εἶπαν αὐτῷ&lt;/strong&gt; : what sign are you showing us that you do these things ?&lt;/p&gt;
&lt;p&gt;[343785] John 2.20&lt;br /&gt;
&lt;strong&gt;εἶπαν οὖν οἱ Ἰουδαῖοι&lt;/strong&gt; : this temple was built in forty-six years and you will raise it in three days ?&lt;/p&gt;
&lt;p&gt;[343750] John 2.18&lt;br /&gt;
&lt;strong&gt;ἀπεκρίθησαν οὖν οἱ Ἰουδαῖοι&lt;/strong&gt; and εἶπαν αὐτῷ : what sign are you showing us that you do these things ?&lt;/p&gt;
&lt;p&gt;[343754] John 2.18&lt;br /&gt;
&lt;strong&gt;ἀπεκρίθησαν οὖν οἱ Ἰουδαῖοι καὶ εἶπαν αὐτῷ&lt;/strong&gt; : what sign are you showing us that you do these things ?&lt;/p&gt;
&lt;p&gt;[343770] John 2.19&lt;br /&gt;
Jesus answered and &lt;strong&gt;εἶπεν αὐτοῖς&lt;/strong&gt; : destroy this temple and in three days I will raise it&lt;/p&gt;
&lt;p&gt;[343767] John 2.19&lt;br /&gt;
&lt;strong&gt;ἀπεκρίθη Ἰησοῦς&lt;/strong&gt; and εἶπεν αὐτοῖς : destroy this temple and in three days I will raise it&lt;/p&gt;
&lt;p&gt;[343769] John 2.19&lt;br /&gt;
&lt;strong&gt;ἀπεκρίθη Ἰησοῦς καὶ εἶπεν αὐτοῖς&lt;/strong&gt; : destroy this temple and in three days I will raise it&lt;/p&gt;
&lt;p&gt;[343872] John 2.24,2.25&lt;br /&gt;
&lt;strong&gt;αὐτὸς Ἰησοῦς&lt;/strong&gt; did not entrust himself to them because he knows everyone and because he had no need that anyone should testify about man for he knew what was in man&lt;/p&gt;
&lt;p&gt;[343638] John 2.12&lt;br /&gt;
After this , he and ἡ μήτηρ αὐτοῦ and &lt;strong&gt;οἱ ἀδελφοὶ αὐτοῦ&lt;/strong&gt; and οἱ μαθηταὶ αὐτοῦ went down into Capernaum and there they remained not many days&lt;/p&gt;
&lt;p&gt;[343632] John 2.12&lt;br /&gt;
After this , &lt;strong&gt;αὐτὸς καὶ ἡ μήτηρ αὐτοῦ καὶ οἱ ἀδελφοὶ αὐτοῦ καὶ οἱ μαθηταὶ αὐτοῦ&lt;/strong&gt; went down into Capernaum and there they remained not many days&lt;/p&gt;
&lt;p&gt;[343444] John 2.3&lt;br /&gt;
There was no wine because &lt;strong&gt;ὁ οἶνος τοῦ γάμου&lt;/strong&gt; had been finished off . Then ἡ μήτηρ τοῦ Ἰησοῦ says to him : there is no wine&lt;/p&gt;
&lt;p&gt;[343442] John 2.3&lt;br /&gt;
There was no wine because &lt;strong&gt;συνετελέσθη ὁ οἶνος τοῦ γάμου&lt;/strong&gt; . Then ἡ μήτηρ τοῦ Ἰησοῦ says to him : there is no wine&lt;/p&gt;
&lt;p&gt;[343461] John 2.4&lt;br /&gt;
&lt;strong&gt;λέγει αὐτῇ ὁ Ἰησοῦς&lt;/strong&gt; , what (concern is that) to me and you , woman ? My hour is not yet come&lt;/p&gt;
&lt;p&gt;[343765] John 2.18&lt;br /&gt;
ἀπεκρίθησαν οὖν οἱ Ἰουδαῖοι καὶ εἶπαν αὐτῷ : what sign are you showing us that &lt;strong&gt;ταῦτα ποιεῖς&lt;/strong&gt; ?&lt;/p&gt;
&lt;p&gt;[343459] John 2.3&lt;br /&gt;
There was no wine because συνετελέσθη ὁ οἶνος τοῦ γάμου . Then ἡ μήτηρ τοῦ Ἰησοῦ says to him : &lt;strong&gt;οἶνος οὐκ ἔστιν&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343740] John 2.17&lt;br /&gt;
οἱ μαθηταὶ αὐτοῦ remembered that &lt;strong&gt;γεγραμμένον ἐστίν&lt;/strong&gt; : the zeal for your house will devour me&lt;/p&gt;
&lt;p&gt;[343734] John 2.17&lt;br /&gt;
&lt;strong&gt;ἐμνήσθησαν οἱ μαθηταὶ αὐτοῦ ὅτι γεγραμμένον ἐστίν&lt;/strong&gt; : the zeal for your house will devour me&lt;/p&gt;
&lt;p&gt;[343476] John 2.4&lt;br /&gt;
λέγει αὐτῇ ὁ Ἰησοῦς , what (concern is that) to me and you , woman ? &lt;strong&gt;ἡ ὥρα μου&lt;/strong&gt; is not yet come&lt;/p&gt;
&lt;p&gt;[343543] John 2.8&lt;br /&gt;
καὶ λέγει αὐτοῖς , draw now and carry it to the head-steward &lt;strong&gt;οἱ δὲ ἤνεγκαν&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343829] John 2.22&lt;br /&gt;
so when he was raised from the dead , οἱ μαθηταὶ αὐτοῦ remembered that &lt;strong&gt;τοῦτο ἔλεγεν&lt;/strong&gt; and they believed the Scripture and the word which ὁ Ἰησοῦς said&lt;/p&gt;
&lt;p&gt;[343479] John 2.5&lt;br /&gt;
&lt;strong&gt;λέγει ἡ μήτηρ αὐτοῦ τοῖς διακόνοις&lt;/strong&gt; : do whatever he tells you to&lt;/p&gt;
&lt;p&gt;[343439] John 2.3&lt;br /&gt;
&lt;strong&gt;καὶ οἶνον οὐκ εἶχον ὅτι συνετελέσθη ὁ οἶνος τοῦ γάμου&lt;/strong&gt; . Then ἡ μήτηρ τοῦ Ἰησοῦ says to him : οἶνος οὐκ ἔστιν&lt;/p&gt;
&lt;p&gt;[343416] John 2.1&lt;br /&gt;
And on the third day , a wedding was happening in Cana of Galilee and &lt;strong&gt;ἦν ἡ μήτηρ τοῦ Ἰησοῦ ἐκεῖ&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343580] John 2.10&lt;br /&gt;
λέγει αὐτῷ &lt;strong&gt;πᾶς ἄνθρωπος&lt;/strong&gt; first put out the good wine and when they are drunk , the inferior . you have kept the good wine until now&lt;/p&gt;
&lt;p&gt;[343534] John 2.8&lt;br /&gt;
καὶ λέγει αὐτοῖς , &lt;strong&gt;ἀντλήσατε νῦν&lt;/strong&gt; and carry it to the head-steward οἱ δὲ ἤνεγκαν&lt;/p&gt;
&lt;p&gt;[343537] John 2.8&lt;br /&gt;
καὶ λέγει αὐτοῖς , ἀντλήσατε νῦν and &lt;strong&gt;φέρετε τῷ ἀρχιτρικλίνῳ&lt;/strong&gt; οἱ δὲ ἤνεγκαν&lt;/p&gt;
&lt;p&gt;[343536] John 2.8&lt;br /&gt;
καὶ λέγει αὐτοῖς , &lt;strong&gt;ἀντλήσατε νῦν καὶ φέρετε τῷ ἀρχιτρικλίνῳ&lt;/strong&gt; οἱ δὲ ἤνεγκαν&lt;/p&gt;
&lt;p&gt;[343619] John 2.11&lt;br /&gt;
This beginning of signs ὁ Ἰησοῦς did in Cana of Galilee and revealed his glory and &lt;strong&gt;ἐπίστευσαν εἰς αὐτὸν οἱ μαθηταὶ αὐτοῦ&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343557] John 2.9,2.10&lt;br /&gt;
as ὁ ἀρχιτρίκλινος tasted the water having become wine and &lt;strong&gt;οὐκ ᾔδει πόθεν ἐστίν&lt;/strong&gt; ( but the servants who drew the water knew ) ὁ ἀρχιτρίκλινος calls the groom λέγει αὐτῷ πᾶς ἄνθρωπος first put out the good wine and when they are drunk , the inferior . you have kept the good wine until now&lt;/p&gt;
&lt;p&gt;[343547] John 2.9,2.10&lt;br /&gt;
as &lt;strong&gt;ἐγεύσατο ὁ ἀρχιτρίκλινος τὸ ὕδωρ οἶνον γεγενημένον&lt;/strong&gt; and οὐκ ᾔδει πόθεν ἐστίν ( but the servants who drew the water knew ) ὁ ἀρχιτρίκλινος calls the groom λέγει αὐτῷ πᾶς ἄνθρωπος first put out the good wine and when they are drunk , the inferior . you have kept the good wine until now&lt;/p&gt;
&lt;p&gt;[343555] John 2.9,2.10&lt;br /&gt;
as &lt;strong&gt;ἐγεύσατο ὁ ἀρχιτρίκλινος τὸ ὕδωρ οἶνον γεγενημένον καὶ οὐκ ᾔδει πόθεν ἐστίν&lt;/strong&gt; ( but the servants who drew the water knew ) ὁ ἀρχιτρίκλινος calls the groom λέγει αὐτῷ πᾶς ἄνθρωπος first put out the good wine and when they are drunk , the inferior . you have kept the good wine until now&lt;/p&gt;
&lt;p&gt;[343796] John 2.20&lt;br /&gt;
εἶπαν οὖν οἱ Ἰουδαῖοι : &lt;strong&gt;ὁ ναὸς οὗτος&lt;/strong&gt; was built in forty-six years and you will raise it in three days ?&lt;/p&gt;
&lt;p&gt;[343661] John 2.13&lt;br /&gt;
and near was the passover of the Jews and &lt;strong&gt;ἀνέβη εἰς Ἱεροσόλυμα ὁ Ἰησοῦς&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343656] John 2.13&lt;br /&gt;
and near was &lt;strong&gt;τὸ πάσχα τῶν Ἰουδαίων&lt;/strong&gt; and ἀνέβη εἰς Ἱεροσόλυμα ὁ Ἰησοῦς&lt;/p&gt;
&lt;p&gt;[343654] John 2.13&lt;br /&gt;
&lt;strong&gt;Καὶ ἐγγὺς ἦν τὸ πάσχα τῶν Ἰουδαίων&lt;/strong&gt; and ἀνέβη εἰς Ἱεροσόλυμα ὁ Ἰησοῦς&lt;/p&gt;
&lt;p&gt;[343660] John 2.13&lt;br /&gt;
&lt;strong&gt;Καὶ ἐγγὺς ἦν τὸ πάσχα τῶν Ἰουδαίων καὶ ἀνέβη εἰς Ἱεροσόλυμα ὁ Ἰησοῦς&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343718] John 2.14,2.15,2.16&lt;br /&gt;
he found, sitting in the temple , the ones selling oxen and sheep and doves , and the coin-dealers and, having made a whip out of ropes , he threw out of the temple all the sheep and the oxen and he threw out the coins of the money-changers and he overturned the tables and &lt;strong&gt;τοῖς τὰς περιστερὰς πωλοῦσιν εἶπεν&lt;/strong&gt; take these things from here . don&#39;t make my father&#39;s house a market-place&lt;/p&gt;
&lt;p&gt;[343711] John 2.14,2.15,2.16&lt;br /&gt;
he found, sitting in the temple , the ones selling oxen and sheep and doves , and the coin-dealers and, having made a whip out of ropes , he threw out of the temple all the sheep and the oxen and he threw out the coins of the money-changers and &lt;strong&gt;τὰς τραπέζας ἀνέστρεψεν&lt;/strong&gt; and τοῖς τὰς περιστερὰς πωλοῦσιν εἶπεν take these things from here . don&#39;t make my father&#39;s house a market-place&lt;/p&gt;
&lt;p&gt;[343423] John 2.2&lt;br /&gt;
&lt;strong&gt;ἐκλήθη δὲ καὶ ὁ Ἰησοῦς καὶ οἱ μαθηταὶ αὐτοῦ εἰς τὸν γάμον&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343720] John 2.16&lt;br /&gt;
and τοῖς τὰς περιστερὰς πωλοῦσιν εἶπεν &lt;strong&gt;ἄρατε ταῦτα ἐντεῦθεν&lt;/strong&gt; . don&#39;t make my father&#39;s house a market-place&lt;/p&gt;
&lt;p&gt;[343570] John 2.9,2.10&lt;br /&gt;
as ἐγεύσατο ὁ ἀρχιτρίκλινος τὸ ὕδωρ οἶνον γεγενημένον καὶ οὐκ ᾔδει πόθεν ἐστίν ( but the servants who drew the water knew ) ὁ ἀρχιτρίκλινος calls the groom λέγει αὐτῷ πᾶς ἄνθρωπος first put out the good wine and when they are drunk , the inferior . you have kept the good wine until now&lt;/p&gt;
&lt;p&gt;[343575] John 2.9,2.10&lt;br /&gt;
as ἐγεύσατο ὁ ἀρχιτρίκλινος τὸ ὕδωρ οἶνον γεγενημένον καὶ οὐκ ᾔδει πόθεν ἐστίν ( but the servants who drew the water knew ) ὁ ἀρχιτρίκλινος calls the groom λέγει αὐτῷ πᾶς ἄνθρωπος first put out the good wine and when they are drunk , the inferior . you have kept the good wine until now&lt;/p&gt;
&lt;p&gt;[343563] John 2.9,2.10&lt;br /&gt;
as ἐγεύσατο ὁ ἀρχιτρίκλινος τὸ ὕδωρ οἶνον γεγενημένον καὶ οὐκ ᾔδει πόθεν ἐστίν ( &lt;strong&gt;οἱ διάκονοι οἱ ἠντληκότες τὸ ὕδωρ&lt;/strong&gt; knew ) ὁ ἀρχιτρίκλινος calls the groom λέγει αὐτῷ πᾶς ἄνθρωπος first put out the good wine and when they are drunk , the inferior . you have kept the good wine until now&lt;/p&gt;
&lt;p&gt;[343474] John 2.4&lt;br /&gt;
λέγει αὐτῇ ὁ Ἰησοῦς , what (concern is that) to me and you , woman ? &lt;strong&gt;οὔπω ἥκει ἡ ὥρα μου&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[398696] John 2.4&lt;br /&gt;
λέγει αὐτῇ ὁ Ἰησοῦς , &lt;strong&gt;τί ἐμοὶ καὶ σοί&lt;/strong&gt; , woman ? οὔπω ἥκει ἡ ὥρα μου&lt;/p&gt;
&lt;p&gt;[343819] John 2.22&lt;br /&gt;
so when &lt;strong&gt;ἠγέρθη ἐκ νεκρῶν&lt;/strong&gt; , οἱ μαθηταὶ αὐτοῦ remembered that τοῦτο ἔλεγεν and they believed the Scripture and the word which ὁ Ἰησοῦς said&lt;/p&gt;
&lt;p&gt;[343823] John 2.22&lt;br /&gt;
&lt;strong&gt;ὅτε οὖν ἠγέρθη ἐκ νεκρῶν ἐμνήσθησαν οἱ μαθηταὶ αὐτοῦ ὅτι τοῦτο ἔλεγεν&lt;/strong&gt; and they believed the Scripture and the word which ὁ Ἰησοῦς said&lt;/p&gt;
&lt;p&gt;[343845] John 2.23&lt;br /&gt;
when &lt;strong&gt;δὲ ἦν ἐν τοῖς Ἱεροσολύμοις ἐν τῷ πάσχα ἐν τῇ ἑορτῇ&lt;/strong&gt; , many believed in his name , seeing his signs which he was doing&lt;/p&gt;
&lt;p&gt;[343832] John 2.22&lt;br /&gt;
ὅτε οὖν ἠγέρθη ἐκ νεκρῶν ἐμνήσθησαν οἱ μαθηταὶ αὐτοῦ ὅτι τοῦτο ἔλεγεν and &lt;strong&gt;ἐπίστευσαν τῇ γραφῇ καὶ τῷ λόγῳ ὃν εἶπεν ὁ Ἰησοῦς&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343831] John 2.22&lt;br /&gt;
&lt;strong&gt;ὅτε οὖν ἠγέρθη ἐκ νεκρῶν ἐμνήσθησαν οἱ μαθηταὶ αὐτοῦ ὅτι τοῦτο ἔλεγεν καὶ ἐπίστευσαν τῇ γραφῇ καὶ τῷ λόγῳ ὃν εἶπεν ὁ Ἰησοῦς&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343449] John 2.3&lt;br /&gt;
καὶ οἶνον οὐκ εἶχον ὅτι συνετελέσθη ὁ οἶνος τοῦ γάμου . &lt;strong&gt;εἶτα λέγει ἡ μήτηρ τοῦ Ἰησοῦ πρὸς αὐτόν&lt;/strong&gt; : οἶνος οὐκ ἔστιν&lt;/p&gt;
&lt;p&gt;[343782] John 2.19&lt;br /&gt;
ἀπεκρίθη Ἰησοῦς καὶ εἶπεν αὐτοῖς : destroy this temple and &lt;strong&gt;ἐν τρισὶν ἡμέραις ἐγερῶ αὐτόν&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343804] John 2.20&lt;br /&gt;
εἶπαν οὖν οἱ Ἰουδαῖοι : ὁ ναὸς οὗτος was built in forty-six years and &lt;strong&gt;σὺ ἐν τρισὶν ἡμέραις ἐγερεῖς αὐτόν&lt;/strong&gt; ?&lt;/p&gt;
&lt;p&gt;[343773] John 2.19&lt;br /&gt;
ἀπεκρίθη Ἰησοῦς καὶ εἶπεν αὐτοῖς : &lt;strong&gt;λύσατε τὸν ναὸν τοῦτον&lt;/strong&gt; and ἐν τρισὶν ἡμέραις ἐγερῶ αὐτόν&lt;/p&gt;
&lt;p&gt;[343778] John 2.19&lt;br /&gt;
ἀπεκρίθη Ἰησοῦς καὶ εἶπεν αὐτοῖς : &lt;strong&gt;λύσατε τὸν ναὸν τοῦτον καὶ ἐν τρισὶν ἡμέραις ἐγερῶ αὐτόν&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343585] John 2.10&lt;br /&gt;
λέγει αὐτῷ &lt;strong&gt;πᾶς ἄνθρωπος πρῶτον τὸν καλὸν οἶνον τίθησιν&lt;/strong&gt; and when they are drunk , the inferior . you have kept the good wine until now&lt;/p&gt;
&lt;p&gt;[398697] John 2.10&lt;br /&gt;
λέγει αὐτῷ πᾶς ἄνθρωπος πρῶτον τὸν καλὸν οἶνον τίθησιν and &lt;strong&gt;ὅταν μεθυσθῶσιν τὸν ἐλάσσω&lt;/strong&gt; . you have kept the good wine until now&lt;/p&gt;
&lt;p&gt;[343587] John 2.10&lt;br /&gt;
λέγει αὐτῷ &lt;strong&gt;πᾶς ἄνθρωπος πρῶτον τὸν καλὸν οἶνον τίθησιν καὶ ὅταν μεθυσθῶσιν τὸν ἐλάσσω&lt;/strong&gt; . you have kept the good wine until now&lt;/p&gt;
&lt;p&gt;[343594] John 2.10&lt;br /&gt;
λέγει αὐτῷ πᾶς ἄνθρωπος πρῶτον τὸν καλὸν οἶνον τίθησιν καὶ ὅταν μεθυσθῶσιν τὸν ἐλάσσω . &lt;strong&gt;σὺ τετήρηκας τὸν καλὸν οἶνον ἕως ἄρτι&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343743] John 2.17&lt;br /&gt;
ἐμνήσθησαν οἱ μαθηταὶ αὐτοῦ ὅτι γεγραμμένον ἐστίν : &lt;strong&gt;ὁ ζῆλος τοῦ οἴκου σου&lt;/strong&gt; will devour me&lt;/p&gt;
&lt;p&gt;[343747] John 2.17&lt;br /&gt;
ἐμνήσθησαν οἱ μαθηταὶ αὐτοῦ ὅτι γεγραμμένον ἐστίν : &lt;strong&gt;ὁ ζῆλος τοῦ οἴκου σου καταφάγεταί με&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343628] John 2.12&lt;br /&gt;
&lt;strong&gt;Μετὰ τοῦτο κατέβη εἰς Καφαρναοὺμ αὐτὸς καὶ ἡ μήτηρ αὐτοῦ καὶ οἱ ἀδελφοὶ αὐτοῦ καὶ οἱ μαθηταὶ αὐτοῦ&lt;/strong&gt; and there they remained not many days&lt;/p&gt;
&lt;p&gt;[343647] John 2.12&lt;br /&gt;
Μετὰ τοῦτο κατέβη εἰς Καφαρναοὺμ αὐτὸς καὶ ἡ μήτηρ αὐτοῦ καὶ οἱ ἀδελφοὶ αὐτοῦ καὶ οἱ μαθηταὶ αὐτοῦ and &lt;strong&gt;ἐκεῖ ἔμειναν οὐ πολλὰς ἡμέρας&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343645] John 2.12&lt;br /&gt;
&lt;strong&gt;Μετὰ τοῦτο κατέβη εἰς Καφαρναοὺμ αὐτὸς καὶ ἡ μήτηρ αὐτοῦ καὶ οἱ ἀδελφοὶ αὐτοῦ καὶ οἱ μαθηταὶ αὐτοῦ καὶ ἐκεῖ ἔμειναν οὐ πολλὰς ἡμέρας&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343890] John 2.24,2.25&lt;br /&gt;
αὐτὸς Ἰησοῦς did not entrust himself to them because he knows everyone and because he had no need that &lt;strong&gt;τις μαρτυρήσῃ περὶ τοῦ ἀνθρώπου&lt;/strong&gt; for he knew what was in man&lt;/p&gt;
&lt;p&gt;[343887] John 2.24,2.25&lt;br /&gt;
αὐτὸς Ἰησοῦς did not entrust himself to them because he knows everyone and because &lt;strong&gt;οὐ χρείαν εἶχεν ἵνα τις μαρτυρήσῃ περὶ τοῦ ἀνθρώπου&lt;/strong&gt; for he knew what was in man&lt;/p&gt;
&lt;p&gt;[343794] John 2.20&lt;br /&gt;
εἶπαν οὖν οἱ Ἰουδαῖοι : &lt;strong&gt;τεσσεράκοντα καὶ ἓξ ἔτεσιν οἰκοδομήθη ὁ ναὸς οὗτος&lt;/strong&gt; and σὺ ἐν τρισὶν ἡμέραις ἐγερεῖς αὐτόν ?&lt;/p&gt;
&lt;p&gt;[343799] John 2.20&lt;br /&gt;
εἶπαν οὖν οἱ Ἰουδαῖοι : &lt;strong&gt;τεσσεράκοντα καὶ ἓξ ἔτεσιν οἰκοδομήθη ὁ ναὸς οὗτος καὶ σὺ ἐν τρισὶν ἡμέραις ἐγερεῖς αὐτόν&lt;/strong&gt; ?&lt;/p&gt;
&lt;p&gt;[343613] John 2.11&lt;br /&gt;
This beginning of signs ὁ Ἰησοῦς did in Cana of Galilee and &lt;strong&gt;ἐφανέρωσεν τὴν δόξαν αὐτοῦ&lt;/strong&gt; and ἐπίστευσαν εἰς αὐτὸν οἱ μαθηταὶ αὐτοῦ&lt;/p&gt;
&lt;p&gt;[343705] John 2.14,2.15,2.16&lt;br /&gt;
he found, sitting in the temple , the ones selling oxen and sheep and doves , and the coin-dealers and, having made a whip out of ropes , he threw out of the temple all the sheep and the oxen and &lt;strong&gt;τῶν κολλυβιστῶν ἐξέχεεν τὸ κέρμα&lt;/strong&gt; and τὰς τραπέζας ἀνέστρεψεν and τοῖς τὰς περιστερὰς πωλοῦσιν εἶπεν ἄρατε ταῦτα ἐντεῦθεν . don&#39;t make my father&#39;s house a market-place&lt;/p&gt;
&lt;p&gt;[343809] John 2.21&lt;br /&gt;
&lt;strong&gt;ἐκεῖνος δὲ ἔλεγεν περὶ τοῦ ναοῦ τοῦ σώματος αὐτοῦ&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343897] John 2.25&lt;br /&gt;
and because οὐ χρείαν εἶχεν ἵνα τις μαρτυρήσῃ περὶ τοῦ ἀνθρώπου &lt;strong&gt;αὐτὸς γὰρ ἐγίνωσκεν τί ἦν ἐν τῷ ἀνθρώπῳ&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343760] John 2.18&lt;br /&gt;
ἀπεκρίθησαν οὖν οἱ Ἰουδαῖοι καὶ εἶπαν αὐτῷ : &lt;strong&gt;τί σημεῖον δεικνύεις ἡμῖν ὅτι ταῦτα ποιεῖς&lt;/strong&gt; ?&lt;/p&gt;
&lt;p&gt;[343519] John 2.7&lt;br /&gt;
λέγει αὐτοῖς ὁ Ἰησοῦς : &lt;strong&gt;γεμίσατε τὰς ὑδρίας ὕδατος&lt;/strong&gt; and they filled them up to the top&lt;/p&gt;
&lt;p&gt;[343525] John 2.7&lt;br /&gt;
λέγει αὐτοῖς ὁ Ἰησοῦς : γεμίσατε τὰς ὑδρίας ὕδατος &lt;strong&gt;καὶ ἐγέμισαν αὐτὰς ἕως ἄνω&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343874] John 2.24,2.25&lt;br /&gt;
αὐτὸς Ἰησοῦς did not entrust himself to them because he knows everyone and because οὐ χρείαν εἶχεν ἵνα τις μαρτυρήσῃ περὶ τοῦ ἀνθρώπου αὐτὸς γὰρ ἐγίνωσκεν τί ἦν ἐν τῷ ἀνθρώπῳ&lt;/p&gt;
&lt;p&gt;[343409] John 2.1&lt;br /&gt;
&lt;strong&gt;Καὶ τῇ ἡμέρᾳ τῇ τρίτῃ γάμος ἐγένετο ἐν Κανὰ τῆς Γαλιλαίας&lt;/strong&gt; and ἦν ἡ μήτηρ τοῦ Ἰησοῦ ἐκεῖ&lt;/p&gt;
&lt;p&gt;[343415] John 2.1&lt;br /&gt;
&lt;strong&gt;Καὶ τῇ ἡμέρᾳ τῇ τρίτῃ γάμος ἐγένετο ἐν Κανὰ τῆς Γαλιλαίας καὶ ἦν ἡ μήτηρ τοῦ Ἰησοῦ ἐκεῖ&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343602] John 2.11&lt;br /&gt;
&lt;strong&gt;ταύτην ἐποίησεν ἀρχὴν τῶν σημείων ὁ Ἰησοῦς ἐν Κανὰ τῆς Γαλιλαίας&lt;/strong&gt; and ἐφανέρωσεν τὴν δόξαν αὐτοῦ and ἐπίστευσαν εἰς αὐτὸν οἱ μαθηταὶ αὐτοῦ&lt;/p&gt;
&lt;p&gt;[343612] John 2.11&lt;br /&gt;
&lt;strong&gt;ταύτην ἐποίησεν ἀρχὴν τῶν σημείων ὁ Ἰησοῦς ἐν Κανὰ τῆς Γαλιλαίας καὶ ἐφανέρωσεν τὴν δόξαν αὐτοῦ καὶ ἐπίστευσαν εἰς αὐτὸν οἱ μαθηταὶ αὐτοῦ&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343725] John 2.16&lt;br /&gt;
and τοῖς τὰς περιστερὰς πωλοῦσιν εἶπεν ἄρατε ταῦτα ἐντεῦθεν . &lt;strong&gt;μὴ ποιεῖτε τὸν οἶκον τοῦ πατρός μου οἶκον ἐμπορίου&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343492] John 2.5&lt;br /&gt;
λέγει ἡ μήτηρ αὐτοῦ τοῖς διακόνοις : &lt;strong&gt;ὅ τι ἂν λέγῃ ὑμῖν ποιήσατε&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343668] John 2.14,2.15,2.16&lt;br /&gt;
&lt;strong&gt;καὶ εὗρεν ἐν τῷ ἱερῷ τοὺς πωλοῦντας βόας καὶ πρόβατα καὶ περιστερὰς καὶ τοὺς κερματιστὰς καθημένους&lt;/strong&gt; and, having made a whip out of ropes , he threw out of the temple all the sheep and the oxen and τῶν κολλυβιστῶν ἐξέχεεν τὸ κέρμα and τὰς τραπέζας ἀνέστρεψεν and τοῖς τὰς περιστερὰς πωλοῦσιν εἶπεν ἄρατε ταῦτα ἐντεῦθεν . μὴ ποιεῖτε τὸν οἶκον τοῦ πατρός μου οἶκον ἐμπορίου&lt;/p&gt;
&lt;p&gt;[343690] John 2.14,2.15,2.16&lt;br /&gt;
καὶ εὗρεν ἐν τῷ ἱερῷ τοὺς πωλοῦντας βόας καὶ πρόβατα καὶ περιστερὰς καὶ τοὺς κερματιστὰς καθημένους and, &lt;strong&gt;ποιήσας φραγέλλιον ἐκ σχοινίων πάντας ἐξέβαλεν ἐκ τοῦ ἱεροῦ τά τε πρόβατα καὶ τοὺς βόας&lt;/strong&gt; and τῶν κολλυβιστῶν ἐξέχεεν τὸ κέρμα and τὰς τραπέζας ἀνέστρεψεν and τοῖς τὰς περιστερὰς πωλοῦσιν εἶπεν ἄρατε ταῦτα ἐντεῦθεν . μὴ ποιεῖτε τὸν οἶκον τοῦ πατρός μου οἶκον ἐμπορίου&lt;/p&gt;
&lt;p&gt;[343684] John 2.14,2.15,2.16&lt;br /&gt;
καὶ εὗρεν ἐν τῷ ἱερῷ τοὺς πωλοῦντας βόας καὶ πρόβατα καὶ περιστερὰς καὶ τοὺς κερματιστὰς καθημένους and, ποιήσας φραγέλλιον ἐκ σχοινίων πάντας ἐξέβαλεν ἐκ τοῦ ἱεροῦ τά τε πρόβατα καὶ τοὺς βόας and τῶν κολλυβιστῶν ἐξέχεεν τὸ κέρμα and τὰς τραπέζας ἀνέστρεψεν and τοῖς τὰς περιστερὰς πωλοῦσιν εἶπεν ἄρατε ταῦτα ἐντεῦθεν . μὴ ποιεῖτε τὸν οἶκον τοῦ πατρός μου οἶκον ἐμπορίου&lt;/p&gt;
&lt;p&gt;[343498] John 2.6&lt;br /&gt;
there were there, standing according to the purification (rites) of the Jews , &lt;strong&gt;λίθιναι ὑδρίαι ἓξ χωροῦσαι ἀνὰ μετρητὰς δύο ἢ τρεῖς&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343494] John 2.6&lt;br /&gt;
&lt;strong&gt;ἦσαν δὲ ἐκεῖ λίθιναι ὑδρίαι ἓξ κατὰ τὸν καθαρισμὸν τῶν Ἰουδαίων κείμεναι χωροῦσαι ἀνὰ μετρητὰς δύο ἢ τρεῖς&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;[343857] John 2.23&lt;br /&gt;
&lt;strong&gt;Ὡς δὲ ἦν ἐν τοῖς Ἱεροσολύμοις ἐν τῷ πάσχα ἐν τῇ ἑορτῇ πολλοὶ ἐπίστευσαν εἰς τὸ ὄνομα αὐτοῦ θεωροῦντες αὐτοῦ τὰ σημεῖα ἃ ἐποίει&lt;/strong&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">A post to the graded-reader mailing list from April 25, 2010.</summary>
  </entry><entry>
    <title type="html">All Subtrees Not Just Clauses</title>
    <link href="https://jktauber.com/2010/04/14/all-subtrees-not-just-clauses/" rel="alternate" type="text/html" title="All Subtrees Not Just Clauses"/>
    <published>2010-04-14</published>
    <updated>2010-04-14</updated>
    <id>https://jktauber.com/2010/04/14/all-subtrees-not-just-clauses</id>
    <content type="html" xml:base="https://jktauber.com/2010/04/14/all-subtrees-not-just-clauses/">&lt;p&gt;A post to the graded-reader mailing list from April 14, 2010.&lt;/p&gt;
&lt;p&gt;I just ran a quick experiment where I treated the targets to learn not just as the clauses but any subtree in the dependency tree that has more than one word.&lt;/p&gt;
&lt;p&gt;This results in 8209 targets in John&#39;s gospel instead of 3206.&lt;/p&gt;
&lt;p&gt;Obviously it means learning common noun phrases and prepositional phrases first.&lt;/p&gt;
&lt;p&gt;In particular, these are the first things learnt when using the next-best algorithm:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;ὁ Ἰησοῦς
ἐν αὐτῷ
τοῦ θεοῦ
ἐκ θεοῦ
λέγει αὐτῷ
λέγει αὐτῷ Ἰησοῦς
λέγει αὐτῷ ὁ Ἰησοῦς
καὶ λέγει αὐτῷ
εἰς αὐτόν
πρὸς αὐτόν
τὸν πατέρα
πρὸς τὸν πατέρα
τὸν πατέρα μου
καὶ τὸν πατέρα μου
ἐν αὐτοῖς
λέγει αὐτοῖς
καὶ λέγει αὐτοῖς
λέγει αὐτοῖς ὁ Ἰησοῦς
εἶπεν αὐτῷ
καὶ εἶπεν ὁ Ἰησοῦς
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Compare this with the first things learnt when the targets are clauses only (i.e. only subtrees rooted on &#34;pred&#34;):&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;εἶπεν
εἶπεν αὐτῷ
ἀπεκρίθη Ἰησοῦς
ἀπεκρίθη αὐτῷ Ἰησοῦς
ἀπεκρίθη Ἰησοῦς αὐτῷ
λέγει
λέγει αὐτῷ
λέγει αὐτῷ Ἰησοῦς
λέγει αὐτῷ ὁ Ἰησοῦς
εἶπεν αὐτῷ ὁ Ἰησοῦς
ἀπεκρίθη ὁ Ἰησοῦς
λέγει αὐτοῖς
λέγει αὐτοῖς Ἰησοῦς
λέγει αὐτοῖς ὁ Ἰησοῦς
εἶπεν αὐτοῖς
ἀπεκρίθη αὐτοῖς
ἀπεκρίθη αὐτοῖς Ἰησοῦς
ἀπεκρίθη αὐτοῖς ὁ Ἰησοῦς
εἶπεν αὐτοῖς Ἰησοῦς
εἶπεν αὐτοῖς ὁ Ἰησοῦς
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;(note, these are just based on surface form in text with no reference to any other linguistic information)&lt;/p&gt;
&lt;p&gt;While it&#39;s kind of nice seeing the noun phrases emerge in the first list, I worry about learning prepositional phrases in isolation from their verb. Thoughts? Of course, when combined with inline replacement into English, the verb &lt;em&gt;will&lt;/em&gt; be shown, albeit in English.&lt;/p&gt;
&lt;p&gt;I also realise now, the former list should include one-word subtrees if the word is a &#34;pred&#34;.&lt;/p&gt;
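&lt;p&gt;A minimal sketch of how such targets could be enumerated, including the one-word &#34;pred&#34; case just mentioned. The &lt;code&gt;(form, head, relation)&lt;/code&gt; tuple layout and the name &lt;code&gt;subtree_targets&lt;/code&gt; are assumptions made for illustration, not the PROIEL file format or the code used for the lists above.&lt;/p&gt;

```python
def subtree_targets(tokens):
    """Return the surface string of every qualifying subtree, in text order.

    tokens: list of (form, head, relation) in text order, where head is
    the index of the governing token or None for the root (hypothetical
    layout chosen for this sketch).
    """
    children = {}
    for index, (_, head, _) in enumerate(tokens):
        children.setdefault(head, []).append(index)

    def subtree(index):
        # Collect the node and everything below it.
        found = [index]
        for child in children.get(index, []):
            found.extend(subtree(child))
        return found

    targets = []
    for index, (_, _, relation) in enumerate(tokens):
        span = sorted(subtree(index))
        # Keep multi-word subtrees, plus one-word subtrees that are a "pred".
        if len(span) > 1 or relation == "pred":
            targets.append(" ".join(tokens[i][0] for i in span))
    return targets
```

&lt;p&gt;Sorting the indices in each subtree gives the surface order of the words, which is all the ordering algorithm currently looks at.&lt;/p&gt;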
&lt;p&gt;James&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">A post to the graded-reader mailing list from April 14, 2010.</summary>
  </entry><entry>
    <title type="html">Initial Code Based on PROIEL Dependency Analysis</title>
    <link href="https://jktauber.com/2010/04/12/initial-code-based-proiel-dependency-analysis/" rel="alternate" type="text/html" title="Initial Code Based on PROIEL Dependency Analysis"/>
    <published>2010-04-12</published>
    <updated>2010-04-12</updated>
    <id>https://jktauber.com/2010/04/12/initial-code-based-proiel-dependency-analysis</id>
    <content type="html" xml:base="https://jktauber.com/2010/04/12/initial-code-based-proiel-dependency-analysis/">&lt;p&gt;A post to the graded-reader mailing list from April 12, 2010.&lt;/p&gt;
&lt;p&gt;Until this weekend, all the GNT graded reader work I&#39;d done has used clause boundaries from OpenText.org.&lt;/p&gt;
&lt;p&gt;With the availability of the PROIEL dependency tree analysis, I thought I&#39;d give that a go.&lt;/p&gt;
&lt;p&gt;I&#39;ve uploaded to github code for extracting the clauses in John&#39;s Gospel and generating a very basic reading programme from that.&lt;/p&gt;
&lt;p&gt;Clauses were extracted by looking at any &#39;pred&#39; arc and linearizing all nodes from that point down. If there were embedded preds then clauses corresponding to both inner and outer preds were generated.&lt;/p&gt;
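&lt;p&gt;The extraction step can be sketched roughly as follows. This is an illustration under simplifying assumptions rather than the code uploaded to github: tokens are given as hypothetical &lt;code&gt;(form, head, relation)&lt;/code&gt; tuples in text order, not in the PROIEL file format, and &lt;code&gt;clauses&lt;/code&gt; is a name made up here.&lt;/p&gt;

```python
def clauses(tokens):
    """Return one linearized clause per token whose relation is "pred".

    tokens: list of (form, head, relation) in text order, where head is
    the index of the governing token or None for the root (hypothetical
    layout chosen for this sketch).
    """
    children = {}
    for index, (_, head, _) in enumerate(tokens):
        children.setdefault(head, []).append(index)

    def subtree(index):
        # Collect the node and everything below it.
        found = [index]
        for child in children.get(index, []):
            found.extend(subtree(child))
        return found

    # An embedded pred appears both inside its parent clause and on its own.
    return [
        " ".join(tokens[i][0] for i in sorted(subtree(index)))
        for index, (_, _, relation) in enumerate(tokens)
        if relation == "pred"
    ]
```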
&lt;p&gt;Note that the current code is based only on forms, with no use made of syntactic or morphological information. I also can&#39;t do inline replacement into an English context because I don&#39;t have an English text mapped to the PROIEL analysis.&lt;/p&gt;
&lt;p&gt;However, my initial impression is that the PROIEL analysis will be preferable to work with moving forward.&lt;/p&gt;
&lt;p&gt;James&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;Then Patrick Narkinsky asked:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Could you clarify in what ways you see the PROIEL data being superior to the opentext data?  One obvious one that leaps to mind is that OpenText seems to be a dead project...&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr /&gt;
&lt;p&gt;It&#39;s actively maintained, is redistributable under a CC license, is based on a freely redistributable text and is a less idiosyncratic analysis.&lt;/p&gt;
&lt;p&gt;Admittedly, I haven&#39;t spent THAT much time with it but it seems that it will be easier to extract the kind of syntactic information I&#39;m interested in from it.&lt;/p&gt;
&lt;p&gt;James&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">A post to the graded-reader mailing list from April 12, 2010.</summary>
  </entry><entry>
    <title type="html">My BibleTech 2010 Talk</title>
    <link href="https://jktauber.com/2010/03/28/my-bibletech-2010-talk/" rel="alternate" type="text/html" title="My BibleTech 2010 Talk"/>
    <published>2010-03-28</published>
    <updated>2010-03-28</updated>
    <id>https://jktauber.com/2010/03/28/my-bibletech-2010-talk</id>
    <content type="html" xml:base="https://jktauber.com/2010/03/28/my-bibletech-2010-talk/">&lt;p&gt;Yesterday I gave a talk on the graded reader ideas at BibleTech.&lt;/p&gt;
&lt;p&gt;Here is a video of my talk.&lt;/p&gt;
&lt;iframe src=&#34;https://player.vimeo.com/video/10489590&#34; width=&#34;500&#34; height=&#34;283&#34; frameborder=&#34;0&#34; webkitallowfullscreen mozallowfullscreen allowfullscreen&gt;&lt;/iframe&gt;

&lt;p&gt;The abstract:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We will discuss a new approach to language learning based on texts, with a special focus on learning Greek from the New Testament.&lt;/p&gt;
&lt;p&gt;We will be covering how various linguistic analyses of a text such as the Greek New Testament can help determine the order in which vocabulary and grammar is introduced and how each new word or grammatical concept can be shown in the context of the text.&lt;/p&gt;
&lt;p&gt;Lastly, we will also discuss various algorithms that have been implemented as well as open source Python code for producing this new kind of graded reader.&lt;/p&gt;
&lt;/blockquote&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Yesterday I gave a talk on the graded reader ideas at BibleTech.</summary>
  </entry><entry>
    <title type="html">The “Next-Best” Algorithm</title>
    <link href="https://jktauber.com/2008/04/01/next-best-algorithm/" rel="alternate" type="text/html" title="The “Next-Best” Algorithm"/>
    <published>2008-04-01</published>
    <updated>2008-04-01</updated>
    <id>https://jktauber.com/2008/04/01/next-best-algorithm</id>
    <content type="html" xml:base="https://jktauber.com/2008/04/01/next-best-algorithm/">&lt;p&gt;A post to the graded-reader mailing list from April 1, 2008.&lt;/p&gt;
&lt;p&gt;In the last few posts, I&#39;ve mentioned a simple algorithm I&#39;ve used (one of a number) for ordering items.&lt;/p&gt;
&lt;h3&gt;The Input&lt;/h3&gt;
&lt;p&gt;This algorithm, like all the ordering algorithms I&#39;ve tried, takes as input a list of target-item pairs. For example,&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;T1 I1
T1 I3
T1 I7
T2 I2
T2 I7
T3 I4
...
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;means that to read T1, you need to know I1, I3, I7; to read T2, you need to know I2, I7 and so on.&lt;/p&gt;
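&lt;p&gt;As a small sketch, pairs in this form could be grouped by target like so. It assumes one whitespace-separated pair per line, and &lt;code&gt;read_pairs&lt;/code&gt; is a hypothetical helper, not part of the checked-in code.&lt;/p&gt;

```python
from collections import defaultdict


def read_pairs(lines):
    """Group whitespace-separated "target item" lines by target."""
    prerequisites = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        target, item = line.split()
        prerequisites[target].append(item)
    return dict(prerequisites)
```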
&lt;p&gt;The targets and items can be anything. For the various stats I&#39;ve posted here I&#39;ve used verses for the targets and either lemmas or inflected forms for the items. In the sample reader online, I use clauses as the targets and a combination of lemmas, inflected forms and a little bit of morphology (not much yet). If you want to model the fact that students can&#39;t read a target until they&#39;ve learnt some syntactic point or even some cultural point, that can be modeled by including an appropriate item for this.&lt;/p&gt;
&lt;p&gt;I make this point to emphasize that the ordering algorithm is independent of what we choose as targets and what items we include as prerequisites to being able to comprehend those targets.&lt;/p&gt;
&lt;h3&gt;The Output&lt;/h3&gt;
&lt;p&gt;What this algorithm (like my others) outputs is what I sometimes refer to, in comments and elsewhere, as a &#34;learning programme&#34;. (Yes, I tend to use that spelling when referring to any ordered list to be followed that isn&#39;t a computer program.)&lt;/p&gt;
&lt;p&gt;Such a programme looks like this:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;learn I2
learn I5
learn I7
know T2
learn I1
learn I3
know T1
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Note that this algorithm will sometimes (as it does in the example above) prematurely mention an item that could be delayed (in this case I5) so the optimize-order code I&#39;ve mentioned previously and uploaded to Google Code is useful as a post-processing step.&lt;/p&gt;
&lt;h3&gt;The Algorithm Itself&lt;/h3&gt;
&lt;p&gt;The algorithm is very simple and follows an iterative process. At each step, each item not yet learnt is assigned a score. The item with the highest score is then learnt and the process repeats (with the scores being recalculated each time on the remaining items).&lt;/p&gt;
&lt;p&gt;The score favours items that are the only remaining unlearnt item (or one of only a few remaining unlearnt items) in a lot of different targets.&lt;/p&gt;
&lt;p&gt;At each step, each unlearnt item receives, for each target the item is a prerequisite for, an additional score of 1 / 2^num_unlearnt_items_in_target.&lt;/p&gt;
&lt;p&gt;In other words, for each target the item is the only unlearnt item in, the score goes up by 1/2, for each target the item is one of two unlearnt items in, the score goes up by 1/4, for each target the item is one of three unlearnt items in, the score goes up by 1/8 and so on.&lt;/p&gt;
&lt;p&gt;I haven&#39;t done much experimentation to see if this exponential decay is optimal but it seems to give good results.&lt;/p&gt;
&lt;p&gt;Because this algorithm is iterative and picks a single item at each step rather than exploring multiple ordering possibilities, I&#39;m tentatively calling this algorithm the &#34;next-best&#34; algorithm.&lt;/p&gt;
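&lt;p&gt;The scoring rule described above can be sketched in a few lines. This is an illustration rather than the checked-in &lt;code&gt;next-best.py&lt;/code&gt;: the function name, the tuple-based output and the alphabetical tie-breaking are all assumptions made here.&lt;/p&gt;

```python
from collections import defaultdict


def next_best(pairs):
    """Order items greedily; return ("learn", item) and ("know", target) steps."""
    # Map each target to the set of items it still requires.
    missing = defaultdict(set)
    for target, item in pairs:
        missing[target].add(item)

    programme = []
    while missing:
        # Each unlearnt item scores 1 / 2**n for every unread target it
        # appears in, where n is that target's count of unlearnt items.
        scores = defaultdict(float)
        for items in missing.values():
            for item in items:
                scores[item] += 1 / 2 ** len(items)

        # Ties are broken by item name so the output is deterministic
        # (an assumption; the post does not say how ties are handled).
        best = max(sorted(scores), key=scores.get)
        programme.append(("learn", best))

        # Mark the item as learnt and report any targets now readable.
        for target in sorted(missing):
            missing[target].discard(best)
            if not missing[target]:
                programme.append(("know", target))
                del missing[target]
    return programme
```

&lt;p&gt;Run on just the six pairs listed in the input example (ignoring the elided rest), this sketch gives: learn I4, know T3, learn I7, learn I2, know T2, learn I1, learn I3, know T1.&lt;/p&gt;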
&lt;p&gt;I&#39;ve checked in the code as &lt;a href=&#34;http://code.google.com/p/graded-reader/source/browse/trunk/code/next-best.py&#34;&gt;http://code.google.com/p/graded-reader/source/browse/trunk/code/next-best.py&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;It is important to note that this algorithm currently considers all items equally easy (or difficult!) to learn and assumes they are independent. However, it would be relatively easy to augment the algorithm with difficulty weightings and I plan to do that soon.&lt;/p&gt;
&lt;p&gt;Another feature that I&#39;m considering is being able to &#34;pin down&#34; certain items as not being available until a particular point. You may, for example, want to delay the introduction of participles but otherwise have the algorithm come up with its own ordering.&lt;/p&gt;
&lt;p&gt;James&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">A post to the graded-reader mailing list from April 1, 2008.</summary>
  </entry><entry>
    <title type="html">Vocab Coverage Table for a Better Ordering</title>
    <link href="https://jktauber.com/2008/03/29/vocab-coverage-table-better-ordering/" rel="alternate" type="text/html" title="Vocab Coverage Table for a Better Ordering"/>
    <published>2008-03-29T10:43:00</published>
    <updated>2008-03-29T10:43:00</updated>
    <id>https://jktauber.com/2008/03/29/vocab-coverage-table-better-ordering</id>
    <content type="html" xml:base="https://jktauber.com/2008/03/29/vocab-coverage-table-better-ordering/">&lt;p&gt;A post to the graded-reader mailing list from March 29, 2008.&lt;/p&gt;
&lt;p&gt;I thought I&#39;d calculate the vocabulary coverage table assuming the ordering generated for the post &#34;just how much can frequency ordering be improved on?&#34;. To do this, I modified &lt;code&gt;vocab-coverage.py&lt;/code&gt; to load in an arbitrary learning programme instead of assuming a frequency ordering. The code is now checked in as &lt;code&gt;vocab-coverage-arbitrary.py&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Here&#39;s the original frequency ordering of forms in the Greek NT (using counts rather than percentages in the cells):&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;                   0%     50%     75%     90%     95%     100%

          100    7928    4585      88       1       0        0
          200    7931    6291     515      26       4        4
          500    7935    7388    2149     182      46       39
         1000    7937    7700    4085     631     184      141
         2000    7938    7838    5765    1736     628      456
         5000    7939    7920    7232    4161    2275     1711
         8000    7939    7935    7684    5691    3784     3004
        12000    7941    7939    7879    6858    5149     4310
        16000    7941    7941    7937    7777    7060     6549
        20000    7941    7941    7941    7941    7941     7941
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And here&#39;s the table with the ordering produced in the &#34;just how much can frequency ordering be improved on?&#34; post:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;                   0%     50%     75%     90%       95%      100%

          100    7896    1762      78     *37*      *36*      *36*
          200    7927    4590     339     *81*      *71*      *70*
          500    7933    6781    1572    *315*     *225*     *213*
         1000    7935    7455    3155    *802*     *526*     *491*
         2000    7936    7739    4872   *1820*    *1242*    *1144*
         5000    7939    7869    6400    3592     *3246*    *3244*
         8000    7939    7908    7156    5071     *4745*    *4742*
        12000    7939    7924    7501    6501     *6463*    *6463*
        16000    7940    7933    7791    7646     *7645*    *7645*
        20000    7941    7941    7941    7941      7941      7941
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I&#39;ve marked with asterisks those instances where the number is better than the frequency ordering.&lt;/p&gt;
&lt;p&gt;Note that because the ordering algorithm was highly biased towards reading entire verses, it is actually worse at coverage levels of 75% or below. Even at 90% it&#39;s only better for the first 2000 items.&lt;/p&gt;
&lt;p&gt;But for the 100% familiarity level, you can see just how much better even the simple algorithm I used (which I will explain shortly) is than frequency ordering. For 200 forms, you get 70 verses instead of 4!&lt;/p&gt;
&lt;p&gt;I&#39;ll repeat the caveats I mentioned in the other post, though: items are considered independent and equally easy to learn, there&#39;s no consideration of morphology, syntax or idiom, and this is using verses as targets. We&#39;ll fix all that over time.&lt;/p&gt;
&lt;p&gt;James&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">A post to the graded-reader mailing list from March 29, 2008.</summary>
  </entry><entry>
    <title type="html">Ordering is Ultimately of Targets not Items</title>
    <link href="https://jktauber.com/2008/03/29/ordering-ultimately-targets-not-items/" rel="alternate" type="text/html" title="Ordering is Ultimately of Targets not Items"/>
    <published>2008-03-29T10:42:00</published>
    <updated>2008-03-29T10:42:00</updated>
    <id>https://jktauber.com/2008/03/29/ordering-ultimately-targets-not-items</id>
    <content type="html" xml:base="https://jktauber.com/2008/03/29/ordering-ultimately-targets-not-items/">&lt;p&gt;A post to the graded-reader mailing list from March 29, 2008.&lt;/p&gt;
&lt;p&gt;[this is based on a blog post from August 2005 but with the terminology changed]&lt;/p&gt;
&lt;p&gt;Say you have written a program which lists an order in which to learn items along with an indication, every so often, of what new target has been reached. Running on the Greek lexemes of 1John, you might get something starting like this:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;learn μαρτυρέω
learn θεός
learn ἐν
learn εἰμί
learn ὁ
learn τρεῖς
learn ὅτι
know 230507
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This gives seven items to learn and then a target that has been reached (230507 = 1John 5.7). The problem is that two of those items are unnecessary. You only need to learn μαρτυρέω, εἰμί, ὁ, τρεῖς and ὅτι to be able to read 1John 5.7.&lt;/p&gt;
&lt;p&gt;The problem is that the program is ordering items first and only then establishing at each point what goals (if any) have been achieved.&lt;/p&gt;
&lt;p&gt;What you really want to do is not display an item until it is needed. So back in 2005, I wrote some code that optimizes the ordering of items by delaying any that are not yet needed.&lt;/p&gt;
&lt;p&gt;I&#39;ve now made that code more generic and will check it in shortly.&lt;/p&gt;
&lt;p&gt;It can be used as a post-processor on ordering from any source, even a manually crafted list of items. It will optimize the ordering of items for the same ordering of targets.&lt;/p&gt;
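&lt;p&gt;The idea of delaying items can be sketched as below. This is not the code being checked in, just an illustration: &lt;code&gt;optimize_order&lt;/code&gt; and its arguments are hypothetical names, the score calculation is left out, and keeping never-needed items at the end of the programme is an assumption.&lt;/p&gt;

```python
def optimize_order(programme, prerequisites):
    """Delay each "learn" step until just before the first target needing it.

    programme: list of ("learn", item) / ("know", target) steps.
    prerequisites: dict mapping each target to the set of items it needs.
    """
    pending = []    # items mentioned so far but not yet emitted
    optimized = []
    for action, name in programme:
        if action == "learn":
            pending.append(name)
            continue
        # A target was reached: emit only the pending items it needs,
        # keeping their original relative order.
        needed = prerequisites[name]
        for item in pending:
            if item in needed:
                optimized.append(("learn", item))
        pending = [item for item in pending if item not in needed]
        optimized.append(("know", name))
    # Items no target needed are kept at the end (an assumption).
    optimized.extend(("learn", item) for item in pending)
    return optimized
```

&lt;p&gt;Applied to the 1John listing above, with μαρτυρέω, εἰμί, ὁ, τρεῖς and ὅτι as the prerequisites of 230507, this sketch would move θεός and ἐν to after &lt;code&gt;know 230507&lt;/code&gt; while leaving the order of targets unchanged.&lt;/p&gt;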
&lt;p&gt;Because the algorithm for doing such an optimization is nearly identical to what&#39;s necessary to calculate the &#34;area under the curve&#34; that I described in my video (and will write more about soon) my new code also outputs a score.&lt;/p&gt;
&lt;p&gt;I&#39;ll be checking it in shortly.&lt;/p&gt;
&lt;p&gt;James&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;It&#39;s available at:&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;http://code.google.com/p/graded-reader/source/browse/trunk/code/optimize-order.py&#34;&gt;http://code.google.com/p/graded-reader/source/browse/trunk/code/optimize-order.py&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;James&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">A post to the graded-reader mailing list from March 29, 2008.</summary>
  </entry><entry>
    <title type="html">Just How Much Can Frequency Ordering Be improved On?</title>
    <link href="https://jktauber.com/2008/03/26/just-how-much-can-frequency-ordering-be-improved/" rel="alternate" type="text/html" title="Just How Much Can Frequency Ordering Be improved On?"/>
    <published>2008-03-26T10:40:00</published>
    <updated>2008-03-26T10:40:00</updated>
    <id>https://jktauber.com/2008/03/26/just-how-much-can-frequency-ordering-be-improved</id>
    <content type="html" xml:base="https://jktauber.com/2008/03/26/just-how-much-can-frequency-ordering-be-improved/">&lt;p&gt;A post to the graded-reader mailing list from March 26, 2008.&lt;/p&gt;
&lt;p&gt;Here&#39;s a quick demonstration. Recall that in my previous post, I pointed out that learning the top 100 inflected forms gives you 0 (zero, nada) target verses in the GNT. I showed that, for example, target 130528 (1 Thessalonians 5.28) gets excluded because of one form that is #235 while the other eight forms appear in the top 66.&lt;/p&gt;
&lt;p&gt;Well, what if those 9 forms were learnt first? That is:&lt;/p&gt;
&lt;p&gt;Χριστοῦ, κυρίου, Ἰησοῦ, ὑμῶν, μετά, τοῦ, χάρις, ἡ, ἡμῶν&lt;/p&gt;
&lt;p&gt;Not only could 130528 be read but also 071623.&lt;/p&gt;
&lt;p&gt;Now if the reader learnt πάντων (just one more form) they could read three more verses: 140318, 191325 and 272221.&lt;/p&gt;
&lt;p&gt;Now introduce these six forms:&lt;/p&gt;
&lt;p&gt;καί, ὑμῖν, ἀπό, εἰρήνη, πατρός, θεοῦ&lt;/p&gt;
&lt;p&gt;and suddenly &lt;em&gt;seven&lt;/em&gt; more verses are readable: 140102, 070103, 100102,  110102, 090103, 180103, 080102&lt;/p&gt;
&lt;p&gt;This was just with one algorithm I&#39;m experimenting with (which I&#39;ll explain and provide code for soon) and there are likely others that do better.&lt;/p&gt;
&lt;p&gt;So instead of 100 forms giving 0 verses, we now have just 16 forms giving us 12 entire verses from an actual corpus.&lt;/p&gt;
&lt;p&gt;The usual caveats apply: items are considered independent and equally easy to learn, there&#39;s no consideration of morphology, syntax or idiom, and this is using verses as targets. We&#39;ll fix all that over time.&lt;/p&gt;
&lt;p&gt;James&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">A post to the graded-reader mailing list from March 26, 2008.</summary>
  </entry><entry>
    <title type="html">If Only They Knew That One Rare Word...</title>
    <link href="https://jktauber.com/2008/03/26/if-only-they-knew-one-rare-word/" rel="alternate" type="text/html" title="If Only They Knew That One Rare Word..."/>
    <published>2008-03-26T10:41:00</published>
    <updated>2008-03-26T10:41:00</updated>
    <id>https://jktauber.com/2008/03/26/if-only-they-knew-one-rare-word</id>
    <content type="html" xml:base="https://jktauber.com/2008/03/26/if-only-they-knew-one-rare-word/">&lt;p&gt;A post to the graded-reader mailing list from March 26, 2008.&lt;/p&gt;
&lt;p&gt;I&#39;m going to talk in more detail about alternatives to frequency order in a different thread but I wanted to share the results of a quite striking little test I did.&lt;/p&gt;
&lt;p&gt;In my last post, I showed the vocab/coverage table applied to fully inflected forms in the Greek NT rather than lexemes. You may have noticed that the 100% coverage column and even the 95% coverage column said 0.0% verses for the 100 most frequent forms.&lt;/p&gt;
&lt;p&gt;If you did, you might then have wondered: is this just a rounding error? The answer is no. Even if you knew the 100 most frequent inflected forms in the GNT, there is not a single verse you would know all the forms in (of course assuming you couldn&#39;t guess).&lt;/p&gt;
&lt;p&gt;I wanted to test if this was because of just one outlier. So I modified the code that produced the table (adding 4 extra lines) to instead output a list of the top ten targets (i.e. verses) whose &lt;em&gt;second least&lt;/em&gt; frequent item (i.e. form) is most frequent overall.&lt;/p&gt;
&lt;p&gt;Here are the results:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;032030      2         [1, 2, 1077]
030146     35         [1, 35, 524]
041135     46         [2, 46, 14597]
130528     66         [5, 19, 38, 45, 49, 59, 65, 66, 235]
071623     66         [5, 19, 38, 45, 59, 66, 235]
070323     68         [3, 3, 29, 65, 68, 131]
020940     72         [8, 18, 22, 22, 44, 49, 49, 72, 102]
012425     78         [36, 78, 2846]
060211     96         [8, 14, 18, 22, 79, 96, 4276]
130519     98         [7, 17, 98, 14731]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;What this listing is showing is that, for example, target 032030 (Luke 20.30) consists of the 1st, 2nd and 1077th most frequent forms; target 030146 (Luke 1.46) consists of the 1st, 35th and 524th most frequent forms. So if the rarest word wasn&#39;t needed, they would jump from needing the top 1077 forms to just the top 2 and from needing the top 524 forms to the top 35.&lt;/p&gt;
&lt;p&gt;Now you may argue that many of these are bad examples because the verse doesn&#39;t make sense in isolation (a good reason to be more careful about what to use as targets) or that the one rare word is actually the one carrying most of the semantic weight.&lt;/p&gt;
&lt;p&gt;But this little test demonstrates that sometimes a single rare item can massively delay reading an otherwise quite readable target unit.&lt;/p&gt;
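&lt;p&gt;A rough sketch of this test follows. It is not the modified table code itself: the function name, the input layout (a dict from target to its list of tokens) and the decision to skip one-item targets are assumptions made for illustration.&lt;/p&gt;

```python
from collections import Counter


def if_only(targets, top=10):
    """Rank targets by the frequency rank of their second-rarest token.

    targets: dict mapping a target id to its list of items (tokens).
    """
    counts = Counter(item for items in targets.values() for item in items)
    # Rank 1 is the most frequent item; ties are ranked in first-seen order.
    rank = {item: n for n, (item, _) in enumerate(counts.most_common(), start=1)}

    rows = []
    for target, items in targets.items():
        ranks = sorted(rank[item] for item in items)
        if len(ranks) >= 2:  # single-item targets are skipped (an assumption)
            rows.append((ranks[-2], target, ranks))
    rows.sort()
    return [(target, second, ranks) for second, target, ranks in rows[:top]]
```

&lt;p&gt;Each returned row has the same three columns as the listing above: the target, the rank of its second least frequent item, and the sorted ranks of all its items.&lt;/p&gt;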
&lt;p&gt;By the way, here&#39;s the same listing based on &lt;em&gt;lexemes&lt;/em&gt; rather than fully inflected forms:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;032030      2           [1, 2, 346]
030146      9           [2, 9, 509]
011615      9           [3, 4, 5, 7, 8, 9, 9, 33]
032448     13           [4, 13, 415]
090124     14           [1, 2, 6, 7, 14, 267]
021337     16           [4, 5, 9, 9, 12, 16, 588]
040620     17           [1, 3, 5, 7, 8, 9, 17, 180]
041135     19           [1, 19, 4752]
040426     19           [1, 1, 3, 4, 7, 8, 9, 19, 56]
031934     24           [1, 1, 3, 5, 9, 15, 23, 24, 311]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I&#39;ll check in the code that produces this shortly.&lt;/p&gt;
&lt;p&gt;James&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;It&#39;s now available at&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;http://code.google.com/p/graded-reader/source/browse/trunk/code/if-only.py&#34;&gt;http://code.google.com/p/graded-reader/source/browse/trunk/code/if-only.py&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;James&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">A post to the graded-reader mailing list from March 26, 2008.</summary>
  </entry><entry>
    <title type="html">GNT Verse Coverage with Frequency Ordering</title>
    <link href="https://jktauber.com/2008/03/25/gnt-verse-coverage-frequency-ordering/" rel="alternate" type="text/html" title="GNT Verse Coverage with Frequency Ordering"/>
    <published>2008-03-25</published>
    <updated>2008-03-25</updated>
    <id>https://jktauber.com/2008/03/25/gnt-verse-coverage-frequency-ordering</id>
    <content type="html" xml:base="https://jktauber.com/2008/03/25/gnt-verse-coverage-frequency-ordering/">&lt;p&gt;A post to the graded-reader mailing list from March 25, 2008.&lt;/p&gt;
&lt;p&gt;[if you&#39;ll indulge me, I&#39;m trying to get all my thoughts and previous writing on these topics in one place and this list is a good place to do it]&lt;/p&gt;
&lt;p&gt;[this is based on a post to b-greek[1] and my blog[2]. I hope the table comes out! ]&lt;/p&gt;
&lt;p&gt;It is fairly common, in the context of learning vocabulary for a particular corpus like the Greek New Testament, to talk about what proportion of the text one could read if one learnt the top N words. I even produced such a table for the GNT back in 1996—see New Testament Vocabulary Count Statistics[3].&lt;/p&gt;
&lt;p&gt;But these sorts of numbers are highly misleading because they don&#39;t tell you what proportion of sentences (or as a rough proxy in the GNT case: verses) you could read, only what proportion of words.&lt;/p&gt;
&lt;p&gt;Reading theorists have suggested that you need to know 95% of the vocabulary of a sentence to comprehend it. So a more interesting set of statistics would be how many verses one could understand 95% of the vocab of if one knew a certain number of words. Of course, there&#39;s a lot more to reading comprehension than knowing the vocab. But it was enough for me to decide to write some code yesterday afternoon to run against my MorphGNT database.&lt;/p&gt;
&lt;p&gt;To first of all give you a flavour in the specific before moving to the final numbers, consider John 3.16, which is, from a vocabulary point of view, a very easy verse to read.&lt;/p&gt;
&lt;p&gt;To be able to read 50% of it, you only need to know the top 28 lexemes in the GNT. To read 75% you only need the top 85 (up to κόσμος). With the top 204 lexemes, you can read 90% of the verse and only a few more: up to 236 (αἰώνιος) gives you the 95%. The only word you would not have come across learning the top 236 words would be μονογενής but even that is in the top 1,200.&lt;/p&gt;
&lt;p&gt;This example does highlight some of the shortcomings of this sort of analysis. There&#39;s no consideration of necessary knowledge of morphology, syntax, idioms, etc. Nor is there any allowance for the fact that the meaning of something like μονογενής is fairly easy to guess from knowledge of more common words. But I still think it&#39;s much more useful than the pure word coverage statistics I linked to above.&lt;/p&gt;
&lt;p&gt;So let&#39;s actually run the numbers on the complete GNT. If you know the top N words, how many verses could you understand 50% of, 75%, 90% or 95% of...&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;vocab / coverage    any      50%        75%      90%      95%     100%

100                 99.9%    91.3%    24.4%     2.1%     0.6%     0.4%
200                 99.9%    96.9%    51.8%     9.8%     3.4%     2.5%
500                 99.9%    99.1%    82.3%    36.5%    18.0%    13.9%
1,000              100.0%    99.7%    93.6%    62.3%    37.3%    30.1%
1,500              100.0%    99.8%    97.2%    76.3%    53.5%    44.8%
2,000              100.0%    99.9%    98.4%    85.1%    65.5%    56.5%
3,000              100.0%   100.0%    99.4%    93.6%    81.0%    74.1%
4,000              100.0%   100.0%    99.7%    97.4%    90.0%    85.5%
5,000              100.0%   100.0%   100.0%    99.4%    96.5%    94.5%
all                100.0%   100.0%   100.0%   100.0%   100.0%   100.0%
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;What this means is that, purely from a vocabulary point of view, if you knew the top 1000 lexemes, then 37.3% of verses in the GNT would be 95% familiar to you.&lt;/p&gt;
&lt;p&gt;Note that this uses:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;verses as the reading target&lt;/li&gt;
&lt;li&gt;lexemes as the individual items to be learnt&lt;/li&gt;
&lt;li&gt;frequency of lexemes as the ordering&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;It is possible to alter any of these variables and in subsequent posts I will do this.&lt;/p&gt;
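&lt;p&gt;To make the calculation concrete, here is a minimal sketch of how one row of such a table could be computed. It is an illustration under stated assumptions, not the code run against MorphGNT: coverage is counted per token, the &#34;any&#34; column is omitted, targets are assumed non-empty, and &lt;code&gt;coverage_row&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```python
from collections import Counter
from fractions import Fraction


def coverage_row(targets, vocab_size, thresholds=(50, 75, 90, 95, 100)):
    """Return the percentage of targets that reach each coverage threshold.

    targets: dict mapping a target id to its list of items (tokens).
    Coverage is counted per token, which is an assumption of this sketch.
    """
    counts = Counter(item for items in targets.values() for item in items)
    # The learner is assumed to know the vocab_size most frequent items.
    known = {item for item, _ in counts.most_common(vocab_size)}

    reached = {t: 0 for t in thresholds}
    for items in targets.values():
        # Exact fraction of this target's tokens that are already known.
        covered = Fraction(sum(item in known for item in items), len(items))
        for t in thresholds:
            if covered * 100 >= t:
                reached[t] += 1
    return {t: 100 * n / len(targets) for t, n in reached.items()}
```

&lt;p&gt;Swapping verses for clauses, lexemes for forms, or frequency for another ordering only changes how &lt;code&gt;targets&lt;/code&gt; and &lt;code&gt;known&lt;/code&gt; are built; the counting stays the same.&lt;/p&gt;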
&lt;p&gt;James&lt;/p&gt;
&lt;p&gt;[1] &lt;a href=&#34;http://lists.ibiblio.org/pipermail/b-greek/2007-November/044685.html&#34;&gt;http://lists.ibiblio.org/pipermail/b-greek/2007-November/044685.html&lt;/a&gt;&lt;br /&gt;
[2] &lt;a href=&#34;http://jtauber.com/blog/2007/11/04/gnt_verse_coverage_statistics/&#34;&gt;http://jtauber.com/blog/2007/11/04/gnt_verse_coverage_statistics/&lt;/a&gt;&lt;br /&gt;
[3] (via Internet Archive&#39;s Wayback Machine) &lt;a href=&#34;http://web.archive.org/web/19961104033056/www.entmp.org/HGrk/grammar/lexicon/NTcount.shtml&#34;&gt;http://web.archive.org/web/19961104033056/www.entmp.org/HGrk/grammar/lexicon/NTcount.shtml&lt;/a&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;I&#39;ve checked in my Python code as: &lt;a href=&#34;http://code.google.com/p/graded-reader/source/browse/trunk/code/vocab-coverage.py&#34;&gt;http://code.google.com/p/graded-reader/source/browse/trunk/code/vocab-coverage.py&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;If you&#39;re not comfortable running it yourself, I can run it on any data you provide.&lt;/p&gt;
&lt;p&gt;(if you send data, I suggest you do it off-list and be careful because a &#34;reply&#34; will go to the entire mailing list)&lt;/p&gt;
&lt;p&gt;Remember that, as I said in my post, there&#39;s no consideration of necessary knowledge of morphology, syntax, idioms, etc. Over time, we can incorporate that, but for now the results are limited to the somewhat naïve assumptions that:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;comprehension is only at the level of the target (the verse in my example data)&lt;/li&gt;
&lt;li&gt;learning the items (lexemes in the example table I gave) is all that matters to comprehending the target&lt;/li&gt;
&lt;li&gt;all items are equally easy to learn&lt;/li&gt;
&lt;li&gt;there is no dependency between items&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;and, of course, the table assumes a frequency ordering of items. Soon I&#39;ll be starting a separate thread on alternative orderings.&lt;/p&gt;
&lt;p&gt;But all that said, the numbers produced are far more useful than misleading notions like &#34;the top 10 words account for 37% of the text&#34;.&lt;/p&gt;
&lt;p&gt;Incidentally, here is the table when applied to &lt;em&gt;forms&lt;/em&gt; in the Greek NT rather than lexemes:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;                0%       50%       75%       90%       95%      100%

   100       99.8%     57.7%      1.1%      0.0%      0.0%      0.0%
   200       99.8%     79.2%      6.4%      0.3%      0.0%      0.0%
   500       99.9%     93.0%     27.0%      2.2%      0.5%      0.4%
 1,000       99.9%     96.9%     51.4%      7.9%      2.3%      1.7%
 2,000       99.9%     98.7%     72.5%     21.8%      7.9%      5.7%
 5,000       99.9%     99.7%     91.0%     52.3%     28.6%     21.5%
 8,000       99.9%     99.9%     96.7%     71.6%     47.6%     37.8%
12,000      100.0%     99.9%     99.2%     86.3%     64.8%     54.2%
16,000      100.0%    100.0%     99.9%     97.9%     88.9%     82.4%
20,000      100.0%    100.0%    100.0%    100.0%    100.0%    100.0%
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The fact that it takes 1,000 forms just to get 2.3% of verses at 95% coverage indicates that frequency alone is not the way to go. Soon, I&#39;ll also produce similar tables using clauses (in the OpenText.org sense), rather than verses, as the target.&lt;/p&gt;
&lt;p&gt;James&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">A post to the graded-reader mailing list from March 25, 2008.</summary>
  </entry><entry>
    <title type="html">Welcome (and some files)</title>
    <link href="https://jktauber.com/2008/03/23/welcome-and-some-files/" rel="alternate" type="text/html" title="Welcome (and some files)"/>
    <published>2008-03-23T10:36:00</published>
    <updated>2008-03-23T10:36:00</updated>
    <id>https://jktauber.com/2008/03/23/welcome-and-some-files</id>
    <content type="html" xml:base="https://jktauber.com/2008/03/23/welcome-and-some-files/">&lt;p&gt;A post to the graded-reader mailing list from March 23, 2008.&lt;/p&gt;
&lt;p&gt;Welcome to the graded-reader mailing list.&lt;/p&gt;
&lt;p&gt;I&#39;ve been getting a lot of queries in response to my presentation so I thought I&#39;d start a mailing list so we can all discuss questions and issues together.&lt;/p&gt;
&lt;p&gt;I also plan to make available the code that I&#39;m using to produce the graded reader. Because it&#39;s closely tied to the particular text and linguistic data I&#39;m currently dealing with, it will take some time to make generic but I plan to release stuff incrementally based on your feedback.&lt;/p&gt;
&lt;p&gt;I want to spend some time going through my current approach and explaining the different components and the ideas behind them. For the most part, these ideas can be used independently of one another so if you don&#39;t like one aspect of what I&#39;ve done, you can still make use of other aspects. Also I&#39;m still improving things in lots of different ways and, of course, I look forward to a lot of new ideas coming from this list.&lt;/p&gt;
&lt;p&gt;Because the video presentation actually doesn&#39;t show much in terms of results, I&#39;ve uploaded two files that will give you a flavour of the current state of my work.&lt;/p&gt;
&lt;p&gt;You can get to these files at &lt;a href=&#34;http://groups.google.com/group/graded-reader&#34;&gt;http://groups.google.com/group/graded-reader&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;example-reader.html&lt;/code&gt; shows the first 50 word forms output by the current version of my software when run on the Greek text of John&#39;s gospel.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;greek_2.pdf&lt;/code&gt; shows lesson 2 of an informal course I&#39;m running for a couple of friends which uses the graded reader approach.&lt;/p&gt;
&lt;p&gt;You&#39;ll notice (1) there is a lot of extra information in the lesson given to students; (2) the order in which words are presented is different.&lt;/p&gt;
&lt;p&gt;There are three reasons for the difference in order:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;the ordering in lesson 2 was hand tweaked from what the software originally produced&lt;/li&gt;
&lt;li&gt;the lesson 2 ordering was produced by an earlier version of the ordering algorithm than the one used for example-reader.html&lt;/li&gt;
&lt;li&gt;example-reader.html used slightly more linguistic information (in particular, it knew about some verb endings) in the generation of ordering&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Note that the goal is to eventually not do any tweaking, but rather to capture in both the software and input data the criteria that motivated the manual reordering in the first place.&lt;/p&gt;
&lt;p&gt;I&#39;ll send separate posts discussing different aspects of what goes into producing the automated output.&lt;/p&gt;
&lt;p&gt;James&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">A post to the graded-reader mailing list from March 23, 2008.</summary>
  </entry><entry>
    <title type="html">Throttle and Delay</title>
    <link href="https://jktauber.com/2008/03/23/throttle-and-delay/" rel="alternate" type="text/html" title="Throttle and Delay"/>
    <published>2008-03-23T10:38:00</published>
    <updated>2008-03-23T10:38:00</updated>
    <id>https://jktauber.com/2008/03/23/throttle-and-delay</id>
    <content type="html" xml:base="https://jktauber.com/2008/03/23/throttle-and-delay/">&lt;p&gt;A post to the graded-reader mailing list from March 23, 2008.&lt;/p&gt;
&lt;p&gt;When you look at example-reader.html[1] you see that as well as the normal verse pairs, there are pairs marked REVIEW.&lt;/p&gt;
&lt;p&gt;This is another idea I&#39;m experimenting with that is independent of other ordering and display choices.&lt;/p&gt;
&lt;p&gt;Basically, when a particular clause such as καὶ εἶπεν is introduced, I never repeat more than 3 instances of it. Instead I store up any additional instances to show later as reminders.&lt;/p&gt;
&lt;p&gt;This &#34;throttle-and-delay&#34; technique is a separate part of the overall pipeline that produces the text.&lt;/p&gt;
&lt;p&gt;The ordering algorithm, before the throttle-and-delay step, produces something like this:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;NT.John.18_c108
NT.John.20_c122
NT.John.11_c131
NT.John.9_c174
NT.John.3_c117
NT.John.12_c121
NT.John.12_c178
NT.John.6_c97
NT.John.7_c53
NT.John.13_c95
NT.John.11_c161
NT.John.21_c114
NT.John.3_c50
NT.John.9_c25
NT.John.3_c12
NT.John.4_c71
NT.John.13_c27
NT.John.1_c206
NT.John.3_c46
NT.John.3_c4
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;and then the penultimate step is taking this and turning it into the following. I&#39;ll explain later what the various parts of the &#34;learn&#34; lines are (I&#39;m adding to them all the time), but for now the thing to note is that know_S means &#34;show this new clause they now know&#34;, know_A means &#34;they know this clause at this point but don&#39;t show it yet&#34; and know_R means &#34;show the previously introduced clause that was delayed due to throttling&#34;.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;learn καί|καί|C-|---|-----|-
learn εἶπε(ν)|λέγω|V-|AAI|3-S--|-ε(ν):sa3S
know_S NT.John.3_c117
know_S NT.John.6_c97
know_S NT.John.7_c53
know_A NT.John.9_c174
know_A NT.John.11_c131
know_A NT.John.11_c161
know_A NT.John.12_c121
know_A NT.John.12_c178
know_A NT.John.13_c95
know_A NT.John.18_c108
know_A NT.John.20_c122
know_A NT.John.21_c114
learn αὐτῷ|αὐτός|RP|---|-DSM-|-
know_S NT.John.1_c198
know_S NT.John.1_c206
know_S NT.John.3_c4
know_A NT.John.3_c12
know_A NT.John.3_c46
know_A NT.John.3_c50
know_A NT.John.4_c71
know_A NT.John.5_c54
know_A NT.John.9_c25
know_A NT.John.13_c27
know_A NT.John.14_c101
know_A NT.John.18_c142
know_A NT.John.20_c132
learn αὐτοῖς|αὐτός|RP|---|-DPM-|-
know_S NT.John.2_c64
know_S NT.John.6_c113
know_S NT.John.6_c174
know_A NT.John.7_c78
know_A NT.John.8_c24
know_A NT.John.8_c54
know_A NT.John.9_c147
know_A NT.John.13_c54
know_A NT.John.16_c78
know_R NT.John.4_c71
know_R NT.John.3_c46
know_R NT.John.5_c54
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This is actually the input to the final stage that produces the HTML.&lt;/p&gt;
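&lt;p&gt;The throttling itself can be sketched in a few lines of Python. This is only an illustration, not the pipeline code: the function name, the &lt;code&gt;limit&lt;/code&gt; and &lt;code&gt;reviews_per_step&lt;/code&gt; parameters and, above all, the rule for when delayed clauses come back as reviews (here: oldest first, a few per step) are assumptions, so it will not reproduce the listing above line for line.&lt;/p&gt;

```python
from collections import deque


def throttle_and_delay(steps, limit=3, reviews_per_step=3):
    """Turn (item, clause_ids) steps into learn/know_S/know_A/know_R lines.

    steps: list of (item, clause_ids) pairs in learning order, where
    clause_ids are the clauses that become readable once item is learnt.
    The review policy below is an assumption made for this sketch.
    """
    delayed = deque()  # clauses known but not yet shown
    lines = []
    for item, clause_ids in steps:
        lines.append(f"learn {item}")
        # Only clauses delayed at earlier steps may be reviewed at this step.
        reviewable = len(delayed)
        # Show at most `limit` of the newly readable clauses.
        for clause_id in clause_ids[:limit]:
            lines.append(f"know_S {clause_id}")
        # The rest are known at this point but held back for later.
        for clause_id in clause_ids[limit:]:
            lines.append(f"know_A {clause_id}")
            delayed.append(clause_id)
        # Release a few of the older delayed clauses as reminders.
        for _ in range(min(reviews_per_step, reviewable)):
            lines.append(f"know_R {delayed.popleft()}")
    return lines
```

&lt;p&gt;Because the function only consumes an ordering and emits annotated lines, it can sit between the ordering algorithm and the HTML stage without either needing to know about it.&lt;/p&gt;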
&lt;p&gt;James&lt;/p&gt;
&lt;p&gt;[1] linked from http://groups.google.com/group/graded-reader/files&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">A post to the graded-reader mailing list from March 23, 2008.</summary>
  </entry><entry>
    <title type="html">Embedding the Target Language in English</title>
    <link href="https://jktauber.com/2008/03/23/embedding-target-language-english/" rel="alternate" type="text/html" title="Embedding the Target Language in English"/>
    <published>2008-03-23T10:37:00</published>
    <updated>2008-03-23T10:37:00</updated>
    <id>https://jktauber.com/2008/03/23/embedding-target-language-english</id>
    <content type="html" xml:base="https://jktauber.com/2008/03/23/embedding-target-language-english/">&lt;p&gt;A post to the graded-reader mailing list from March 23, 2008.&lt;/p&gt;
&lt;p&gt;[this will be a bit of an experiment as to whether I can cut and paste formatted Greek and have it pass through Google Groups. I apologize in advance if it doesn&#39;t work]&lt;/p&gt;
&lt;p&gt;One aspect of the reader that seems to have received a lot of interest is the embedding of the target language (in my case Greek) in English.&lt;/p&gt;
&lt;p&gt;It is important to note that this is entirely independent of the 95% of the code and data which has to do with choosing the order in which to learn things.&lt;/p&gt;
&lt;p&gt;I wanted to explain a little about how it&#39;s produced and what the variables are that could be tweaked or changed altogether.&lt;/p&gt;
&lt;p&gt;First of all, consider the very first block of text introduced:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;John 3.26:
So they came to John and said to him, “Rabbi, the one who was with you on the other side of the Jordan River, about whom you testified – see, he is baptizing, and everyone is flocking to him!”
John 3.27:
John replied καὶ εἶπεν, “No one can receive anything unless it has been given to him from heaven.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;For those of you who don&#39;t know Greek, καὶ εἶπεν means &#34;and (he) said&#34;.&lt;/p&gt;
&lt;p&gt;This was generated because the ordering component of the software said that the first thing to be introduced is clause &lt;code&gt;NT.John.3_c117&lt;/code&gt;. That&#39;s a clause reference from OpenText.org&#39;s clause analysis of the New Testament. Part of my database is a listing of all the clauses, as identified by OpenText.org along with this unique identifier and what chapter/verse the clause comes from:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;NT.John.3_c117|3.27|καὶ εἶπεν,
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;So my code knows that the clause to show is from John 3.27. I decided to always include the previous verse for context as well. So I retrieve John 3.26 and John 3.27 from a database containing the NET translation but annotated with the OpenText.org clause boundaries:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;3.26 [c108 So they came to John ] [c109 and said to him, ] “Rabbi, the one who was with you on the other side of the Jordan River, [c112 about whom you testified – ] [c113 see, ] [c114 he is baptizing, ] [c115 and everyone is flocking to him!” ]
3.27 [c116 John replied ] [c117 and said, ] “No one can receive anything unless it has been given to him from heaven.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Notice that I haven&#39;t annotated everything yet. It&#39;s a slow and laborious process so I tend to just mark clauses as they are needed.&lt;/p&gt;
&lt;p&gt;In some cases, I slightly alter the NET translation so there is something to annotate. This becomes challenging when NET has altered clause order and even more so when the Greek breaks apart words from the one clause that have to be together in the English. I still want to do more work in this area as the key thing to note is I never use the actual translation of the clause when introducing the clause; rather I use everything &lt;em&gt;except&lt;/em&gt; the translation of the clause and that might make the problem easier if thought about in those terms (rather than what my annotation above focuses on which is annotating what English text corresponds to what Greek clause).&lt;/p&gt;
&lt;p&gt;But this annotated NET is then used to produce what you see in the example-reader.html extract shown at the start. If other clauses were known at this point, they would be replaced by the Greek as well. Any clauses already known are shown at normal weight and the new clause being introduced is shown in bold. Hence later on in example-reader.html (at step 13):&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;John 4.49:
The official said to him, “Sir, come down before my child dies.”
John 4.50:
λέγει αὐτῷ ὁ Ἰησοῦς, “Go home; your son will live.” The man believed the word ὃν εἶπεν αὐτῷ ὁ Ἰησοῦς and set off for home.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;So, to summarize: the input to this part of the process is:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;what clause to introduce (by reference number)&lt;/li&gt;
&lt;li&gt;what verse this clause is in&lt;/li&gt;
&lt;li&gt;what other clauses are already known (by reference numbers)&lt;/li&gt;
&lt;li&gt;what the English text of the verse (from 2) and the one before are, annotated by clause references that can be replaced by Greek if known&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The variables to this particular step are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;the unit of text being introduced (in this example, a clause)&lt;/li&gt;
&lt;li&gt;the unit of text to show (in this example, the verse containing the clause and the verse before it)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;There is no reason why the unit of text being introduced in Greek could not be smaller (a phrase or even a word) and the unit of text being shown in English larger (a paragraph, for example).&lt;/p&gt;
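&lt;p&gt;The substitution step summarized above can be sketched as follows. This is a guess at the shape of the step rather than the actual code: the function name &lt;code&gt;embed&lt;/code&gt;, the dictionary and set arguments, the default book prefix and the use of asterisks as a stand-in for bold are all assumptions. It relies only on the &lt;code&gt;[c117 and said, ]&lt;/code&gt; annotation format shown earlier.&lt;/p&gt;

```python
import re

# Matches annotations such as "[c117 and said, ]" in the annotated English.
CLAUSE = re.compile(r"\[c(\d+) (.*?) \]")


def embed(verse_ref, annotated, greek, known, new, book="NT.John"):
    """Replace annotated English clauses with Greek where appropriate.

    verse_ref: e.g. "3.27"; annotated: English text with clause markers;
    greek: dict from clause id to Greek text; known: set of clause ids
    already learnt; new: the clause id being introduced.
    """
    chapter = verse_ref.split(".")[0]

    def substitute(match):
        clause_id = f"{book}.{chapter}_c{match.group(1)}"
        if clause_id == new:
            # Asterisks stand in for bold in this sketch.
            return f"*{greek[clause_id]}*"
        if clause_id in known:
            return greek[clause_id]
        # Unknown clause: keep the English and drop the marker.
        return match.group(2)

    return CLAUSE.sub(substitute, annotated)
```

&lt;p&gt;Changing the unit being introduced (a phrase or a word instead of a clause) would then mostly be a matter of changing what the markers delimit, not the substitution logic.&lt;/p&gt;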
&lt;p&gt;Note that the clauses I am currently dealing with include embedded clauses such as relative clauses and so in the John 4.50 example, we have the relative clause ὃν εἶπεν αὐτῷ ὁ Ἰησοῦς (&#34;that Jesus said to him&#34;) even though it might have been better to wait until the containing noun phrase were readable (which would, of course, have required knowledge of phrase boundaries).&lt;/p&gt;
&lt;p&gt;James&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">A post to the graded-reader mailing list from March 23, 2008.</summary>
  </entry><entry>
    <title type="html">Graded Reader Discussion and Code</title>
    <link href="https://jktauber.com/2008/03/22/graded-reader-discussion-and-code/" rel="alternate" type="text/html" title="Graded Reader Discussion and Code"/>
    <published>2008-03-22</published>
    <updated>2008-03-22</updated>
    <id>https://jktauber.com/2008/03/22/graded-reader-discussion-and-code</id>
    <content type="html" xml:base="https://jktauber.com/2008/03/22/graded-reader-discussion-and-code/">&lt;p&gt;Owing to the amount of interest I received about &lt;a href=&#34;/2008/02/10/new-kind-graded-reader/&#34;&gt;A New Kind of Graded Reader&lt;/a&gt;...&lt;/p&gt;
&lt;p&gt;I have started a mailing list at&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;http://groups.google.com/group/graded-reader&#34;&gt;http://groups.google.com/group/graded-reader&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;and also I plan to make my code available at&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;http://code.google.com/p/graded-reader/&#34;&gt;http://code.google.com/p/graded-reader/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;If you&#39;re interested in the idea applied to any language (not just NT Greek) please join us.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE&lt;/strong&gt;: The code has moved to GitHub: &lt;a href=&#34;https://github.com/jtauber/graded-reader&#34;&gt;https://github.com/jtauber/graded-reader&lt;/a&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Owing to the amount of interest I received about &lt;a href=&#34;/2008/02/10/new-kind-graded-reader/&#34;&gt;A New Kind of Graded Reader&lt;/a&gt;...</summary>
  </entry><entry>
    <title type="html">A New Kind of Graded Reader</title>
    <link href="https://jktauber.com/2008/02/10/new-kind-graded-reader/" rel="alternate" type="text/html" title="A New Kind of Graded Reader"/>
    <published>2008-02-10</published>
    <updated>2008-02-10</updated>
    <id>https://jktauber.com/2008/02/10/new-kind-graded-reader</id>
    <content type="html" xml:base="https://jktauber.com/2008/02/10/new-kind-graded-reader/">&lt;p&gt;Back in 2004, I talked about &lt;a href=&#34;/2004/11/26/programmed-vocabulary-learning-travelling-salesman/&#34;&gt;algorithms for optimal vocabulary ordering&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Then in 2006, I talked about using this and other techniques in &lt;a href=&#34;http://jtauber.com/blog/2006/05/05/teaching_new_testament_greek/&#34;&gt;teaching New Testament Greek&lt;/a&gt; (which I&#39;ve resumed doing with this method, btw).&lt;/p&gt;
&lt;p&gt;Earlier this year at &lt;a href=&#34;/2008/01/14/bibletech-2008/&#34;&gt;BibleTech:2008&lt;/a&gt; I briefly touched on my graded reader approach. It generated a lot of interest so I decided to record a separate presentation at home this weekend, explaining some of the ideas behind the graded reader.&lt;/p&gt;
&lt;p&gt;After multiple failed attempts to upload it to Google Video, it&#39;s now on YouTube and embedded below. Sound was recorded and mixed in Logic Pro and then synchronized with a presentation in Keynote and output as Quicktime.&lt;/p&gt;
&lt;p&gt;Running time is just shy of 9 minutes.&lt;/p&gt;
&lt;iframe width=&#34;420&#34; height=&#34;315&#34; src=&#34;https://www.youtube.com/embed/ErmPyu19dgc&#34; frameborder=&#34;0&#34; allowfullscreen&gt;&lt;/iframe&gt;

&lt;p&gt;&lt;strong&gt;UPDATE 2008-03-22&lt;/strong&gt;: Now see &lt;a href=&#34;/2008/03/22/graded-reader-discussion-and-code/&#34;&gt;Graded Reader Discussion and Code&lt;/a&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Back in 2004, I talked about &lt;a href=&#34;/2004/11/26/programmed-vocabulary-learning-travelling-salesman/&#34;&gt;algorithms for optimal vocabulary ordering&lt;/a&gt;.</summary>
  </entry><entry>
    <title type="html">BibleTech 2008</title>
    <link href="https://jktauber.com/2008/01/14/bibletech-2008/" rel="alternate" type="text/html" title="BibleTech 2008"/>
    <published>2008-01-14</published>
    <updated>2008-01-14</updated>
    <id>https://jktauber.com/2008/01/14/bibletech-2008</id>
    <content type="html" xml:base="https://jktauber.com/2008/01/14/bibletech-2008/">&lt;p&gt;I don&#39;t think I&#39;ve mentioned it here before but next week, I&#39;m one of the keynote speakers at the &lt;a href=&#34;http://www.bibletechconference.com/&#34;&gt;BibleTech 2008&lt;/a&gt; conference in Seattle.&lt;/p&gt;
&lt;p&gt;While I&#39;ve given talks a number of times about my Greek linguistics research, this will be the first time that I&#39;ll get to talk about how I&#39;ve used technology in that research.&lt;/p&gt;
&lt;p&gt;I plan to give a history of the MorphGNT project and the various sub-projects I&#39;ve worked on over the last fifteen years, covering the evolution of data models, text encoding, tool sets and more. I then want to talk about the opportunities that lie ahead and where I hope the work will go in the future, particularly given my collaboration with Ulrik Sandborg-Petersen.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I don&#39;t think I&#39;ve mentioned it here before but next week, I&#39;m one of the keynote speakers at the &lt;a href=&#34;http://www.bibletechconference.com/&#34;&gt;BibleTech 2008&lt;/a&gt; conference in Seattle.</summary>
  </entry><entry>
    <title type="html">GNT Verse Coverage Statistics</title>
    <link href="https://jktauber.com/2007/11/04/gnt-verse-coverage-statistics/" rel="alternate" type="text/html" title="GNT Verse Coverage Statistics"/>
    <published>2007-11-04</published>
    <updated>2007-11-04</updated>
    <id>https://jktauber.com/2007/11/04/gnt-verse-coverage-statistics</id>
    <content type="html" xml:base="https://jktauber.com/2007/11/04/gnt-verse-coverage-statistics/">&lt;p&gt;It is fairly common, in the context of learning vocabulary for a particular corpus like the Greek New Testament, to talk about what proportion of the text one could read if one learnt the top N words.&lt;/p&gt;
&lt;p&gt;I even produced such a table for the GNT back in 1996—see &lt;a href=&#34;http://web.archive.org/web/19961104033056/www.entmp.org/HGrk/grammar/lexicon/NTcount.shtml&#34;&gt;New Testament Vocabulary Count Statistics&lt;/a&gt; (via Internet Archive&#39;s Wayback Machine).&lt;/p&gt;
&lt;p&gt;But these sorts of numbers are highly misleading because they don&#39;t tell you what proportion of sentences (or as a rough proxy in the GNT case: verses) you could read, only what proportion of words.&lt;/p&gt;
&lt;p&gt;Reading theorists have suggested that you need to know 95% of the vocabulary of a sentence to comprehend it. So a more interesting list of statistics would be how many verses one can understand 95% of the vocab of if one knows a certain number of words. Of course, there&#39;s a lot more to reading comprehension than knowing the vocab. But it was enough for me to decide to write some code yesterday afternoon to run against my MorphGNT database.&lt;/p&gt;
&lt;p&gt;To first of all give you a flavour in the specific before moving to the final numbers, consider John 3.16, which is, from a vocabulary point of view, a very easy verse to read.&lt;/p&gt;
&lt;p&gt;To be able to read 50% of it, you only need to know the top 28 lexemes in the GNT. To read 75% you only need the top 85 (up to κόσμος). With the top 204 lexemes, you can read 90% of the verse and only a few more: up to 236 (αἰώνιος) gives you the 95%. The only word you would not have come across learning the top 236 words would be μονογενής but even that is in the top 1,200.&lt;/p&gt;
&lt;p&gt;This example does highlight some of the shortcomings of this sort of analysis. There&#39;s no consideration of necessary knowledge of morphology, syntax, idioms, etc. Nor is there any allowance for the fact that the meaning of something like μονογενής is fairly easy to guess from knowledge of more common words. But I still think it&#39;s much more useful than the pure word coverage statistics I linked to earlier.&lt;/p&gt;
&lt;p&gt;So let&#39;s actually run the numbers on the complete GNT. If you know the top N words, how many verses could you understand 50% of, 75%, 90% or 95% of...&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;               coverage:    any     50%     75%     90%     95%    100%
        vocab
          100             99.9%   91.3%   24.4%    2.1%    0.6%    0.4%
          200             99.9%   96.9%   51.8%    9.8%    3.4%    2.5%
          500             99.9%   99.1%   82.3%   36.5%   18.0%   13.9%
        1,000            100.0%   99.7%   93.6%   62.3%   37.3%   30.1%
        1,500            100.0%   99.8%   97.2%   76.3%   53.5%   44.8%
        2,000            100.0%   99.9%   98.4%   85.1%   65.5%   56.5%
        3,000            100.0%  100.0%   99.4%   93.6%   81.0%   74.1%
        4,000            100.0%  100.0%   99.7%   97.4%   90.0%   85.5%
        5,000            100.0%  100.0%  100.0%   99.4%   96.5%   94.5%
        ALL              100.0%  100.0%  100.0%  100.0%  100.0%  100.0%
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;What this means is &lt;strong&gt;purely from a vocabulary point of view&lt;/strong&gt; if you knew the top 1000 lexemes, then 37.3% of verses in the GNT would be 95% familiar to you.&lt;/p&gt;
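&lt;p&gt;A calculation along these lines can be sketched in Python. This is not the code that was run against MorphGNT: the function name and the input shape (each verse as a non-empty list of lexemes) are assumptions for illustration.&lt;/p&gt;

```python
from collections import Counter


def verse_coverage(verses, top_n, threshold):
    """Fraction of verses whose tokens are at least `threshold` familiar.

    verses: list of non-empty lists of lexemes (one list per verse).
    top_n: how many of the most frequent lexemes are assumed known.
    threshold: required proportion of known tokens in a verse, e.g. 0.95.
    """
    counts = Counter(lexeme for verse in verses for lexeme in verse)
    known = {lexeme for lexeme, _ in counts.most_common(top_n)}
    readable = 0
    for verse in verses:
        familiar = sum(1 for lexeme in verse if lexeme in known)
        if familiar / len(verse) >= threshold:
            readable += 1
    return readable / len(verses)
```

&lt;p&gt;Each cell of the table corresponds to one call with a given vocabulary size and coverage threshold, for example &lt;code&gt;verse_coverage(verses, 1000, 0.95)&lt;/code&gt;.&lt;/p&gt;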
&lt;p&gt;I should emphasise that learning vocabulary in frequency order isn&#39;t necessarily the fastest way to get this proportion of readable verses up. I blogged about this fact three years ago, see &lt;a href=&#34;/2004/11/26/programmed-vocabulary-learning-travelling-salesman/&#34;&gt;Programmed Vocabulary Learning as a Travelling Salesman Problem&lt;/a&gt;, for example.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">It is fairly common, in the context of learning vocabulary for a particular corpus like the Greek New Testament, to talk about what proportion of the text one could read if one learnt the top N words.</summary>
  </entry><entry>
    <title type="html">Announcing MorphGNT.org</title>
    <link href="https://jktauber.com/2006/03/12/announcing-morphgntorg/" rel="alternate" type="text/html" title="Announcing MorphGNT.org"/>
    <published>2006-03-12</published>
    <updated>2006-03-12</updated>
    <id>https://jktauber.com/2006/03/12/announcing-morphgntorg</id>
    <content type="html" xml:base="https://jktauber.com/2006/03/12/announcing-morphgntorg/">&lt;p&gt;I&#39;ve &lt;a href=&#34;/2006/01/01/file-system-archaeology-morphgnt/&#34;&gt;hinted before&lt;/a&gt; about Ulrik Petersen and I collaborating on Greek New Testament linguistic endeavours.&lt;/p&gt;
&lt;p&gt;I&#39;m now delighted to announce the website that will be the home of our collaborative work:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;a href=&#34;http://morphgnt.org&#34;&gt;http://morphgnt.org&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I&#39;ve transferred my MorphGNT files over there and Ulrik has done the same with his Tischendorf 8th and Strong&#39;s Dictionary.&lt;/p&gt;
&lt;p&gt;We&#39;ve been working on a bunch of other stuff for the last few months which will eventually find its way on to that site too.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I&#39;ve &lt;a href=&#34;/2006/01/01/file-system-archaeology-morphgnt/&#34;&gt;hinted before&lt;/a&gt; about Ulrik Petersen and I collaborating on Greek New Testament linguistic endeavours.</summary>
  </entry><entry>
    <title type="html">Bug Fix to Python Unicode Collation Algorithm</title>
    <link href="https://jktauber.com/2006/02/13/bug-fix-python-unicode-collation-algorithm/" rel="alternate" type="text/html" title="Bug Fix to Python Unicode Collation Algorithm"/>
    <published>2006-02-13</published>
    <updated>2006-02-13</updated>
    <id>https://jktauber.com/2006/02/13/bug-fix-python-unicode-collation-algorithm</id>
    <content type="html" xml:base="https://jktauber.com/2006/02/13/bug-fix-python-unicode-collation-algorithm/">&lt;p&gt;See &lt;a href=&#34;/2006/01/27/python-unicode-collation-algorithm/&#34;&gt;Python Unicode Collation Algorithm&lt;/a&gt; for background.&lt;/p&gt;
&lt;p&gt;This version fixes a major bug that prevented the collation algorithm from working properly with any expansions:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;a href=&#34;http://jtauber.com/2006/02/13/pyuca.py&#34;&gt;http://jtauber.com/2006/02/13/pyuca.py&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;UPDATE (2012-06-21)&lt;/strong&gt;: Now see &lt;a href=&#34;https://github.com/jtauber/pyuca&#34;&gt;https://github.com/jtauber/pyuca&lt;/a&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">See &lt;a href=&#34;/2006/01/27/python-unicode-collation-algorithm/&#34;&gt;Python Unicode Collation Algorithm&lt;/a&gt; for background.</summary>
  </entry><entry>
    <title type="html">Dynamic Interlinears with Javascript and CSS</title>
    <link href="https://jktauber.com/2006/01/28/dynamic-interlinears-javascript-and-css/" rel="alternate" type="text/html" title="Dynamic Interlinears with Javascript and CSS"/>
    <published>2006-01-28</published>
    <updated>2006-01-28</updated>
    <id>https://jktauber.com/2006/01/28/dynamic-interlinears-javascript-and-css</id>
    <content type="html" xml:base="https://jktauber.com/2006/01/28/dynamic-interlinears-javascript-and-css/">&lt;p&gt;After the continuation of a permathread on the b-greek mailing list about the pros and cons of interlinears, I built some quick demonstrations of how CSS and Javascript could be used for dynamic interlinear glosses that would not be possible on the printed page.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;http://jtauber.com/2006/interlinear-demo/plain.html&#34;&gt;Plain&lt;/a&gt; — show static glosses&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;http://jtauber.com/2006/interlinear-demo/hover.html&#34;&gt;Hover&lt;/a&gt; — show glosses when a word is hovered over&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;http://jtauber.com/2006/interlinear-demo/toggle.html&#34;&gt;Toggle&lt;/a&gt; — toggle showing a gloss when a word is clicked&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;http://jtauber.com/2006/interlinear-demo/frequency.html&#34;&gt;Frequency&lt;/a&gt; — filter appearance of gloss by frequency&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;They might be interesting as little Javascript tutorials too.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">After the continuation of a permathread on the b-greek mailing list about the pros and cons of interlinears, I built some quick demonstrations of how CSS and Javascript could be used for dynamic interlinear glosses that would not be possible on the printed page.</summary>
  </entry><entry>
    <title type="html">Python Unicode Collation Algorithm</title>
    <link href="https://jktauber.com/2006/01/27/python-unicode-collation-algorithm/" rel="alternate" type="text/html" title="Python Unicode Collation Algorithm"/>
    <published>2006-01-27</published>
    <updated>2006-01-27</updated>
    <id>https://jktauber.com/2006/01/27/python-unicode-collation-algorithm</id>
    <content type="html" xml:base="https://jktauber.com/2006/01/27/python-unicode-collation-algorithm/">&lt;p&gt;My preliminary attempt at a Python implementation of the Unicode Collation Algorithm (UCA) is done and available at:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;a href=&#34;http://jtauber.com/2006/01/27/pyuca.py&#34;&gt;http://jtauber.com/2006/01/27/pyuca.py&lt;/a&gt; (old version—see UPDATE below)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This only implements the simple parts of the algorithm but I have successfully tested it using the Default Unicode Collation Element Table (DUCET) to collate Ancient Greek correctly.&lt;/p&gt;
&lt;p&gt;The core of the algorithm, which is what I have implemented, basically just involves multi-level comparison. For example, &lt;em&gt;café&lt;/em&gt; comes before &lt;em&gt;caff&lt;/em&gt; because at the primary level, the accent is ignored and the first word is treated as if it were &lt;em&gt;cafe&lt;/em&gt;. The secondary level (which considers accents) only applies then to words that are equivalent at the primary level.&lt;/p&gt;
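&lt;p&gt;The two-level idea can be illustrated with a toy sort key. This is not how pyuca works internally: it uses Python&#39;s standard &lt;code&gt;unicodedata&lt;/code&gt; module and plain code point order instead of DUCET weights, and the function name is made up for the example.&lt;/p&gt;

```python
import unicodedata


def two_level_key(word):
    """Sort key: base letters first, accents only as a tie-breaker."""
    decomposed = unicodedata.normalize("NFD", word)
    # Primary level: drop combining marks so accents are ignored.
    primary = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Secondary level: the decomposed form, which still carries the accents.
    # It is only consulted when two words have the same primary key.
    return (primary, decomposed)


print(sorted(["caff", "café", "cafe"], key=two_level_key))
# prints ['cafe', 'café', 'caff']
```

&lt;p&gt;Sorting on a tuple gives the level-by-level behaviour for free: the second element is compared only when the first elements are equal.&lt;/p&gt;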
&lt;p&gt;The UCA (and my code) also support contraction and expansion. Contraction is where multiple letters are treated as a single unit: in Spanish, &lt;em&gt;ch&lt;/em&gt; is treated as a letter coming between &lt;em&gt;c&lt;/em&gt; and &lt;em&gt;d&lt;/em&gt; so that, for example, words beginning &lt;em&gt;ch&lt;/em&gt; should sort after all other words beginning with &lt;em&gt;c&lt;/em&gt;. Expansion is where a single letter is treated as though it were multiple letters: in German, &lt;em&gt;ä&lt;/em&gt; is sorted as if it were &lt;em&gt;ae&lt;/em&gt;, i.e. after &lt;em&gt;ad&lt;/em&gt; but before &lt;em&gt;af&lt;/em&gt;.&lt;/p&gt;
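&lt;p&gt;Contraction and expansion can each be mimicked with a small sort key. The sketch below is a toy under stated assumptions, not the UCA table-driven mechanism: the alphabet is plain lowercase Latin with &lt;em&gt;ch&lt;/em&gt; inserted after &lt;em&gt;c&lt;/em&gt;, the only expansion is the &lt;em&gt;ä&lt;/em&gt; case mentioned above, and both function names are hypothetical.&lt;/p&gt;

```python
import re
import string

# Toy alphabet: lowercase Latin letters with "ch" as its own letter after "c".
ALPHABET = list(string.ascii_lowercase)
ALPHABET.insert(ALPHABET.index("c") + 1, "ch")
WEIGHT = {letter: position for position, letter in enumerate(ALPHABET)}


def contraction_key(word):
    """Treat "ch" as one unit; handles unaccented lowercase input only."""
    return [WEIGHT[element] for element in re.findall("ch|[a-z]", word)]


# Single letters that sort as if they were several letters.
EXPANSIONS = {"ä": "ae"}


def expansion_key(word):
    """Replace each expandable letter with its multi-letter equivalent."""
    return "".join(EXPANSIONS.get(ch, ch) for ch in word)


print(sorted(["chico", "cueva", "dama", "cama"], key=contraction_key))
# prints ['cama', 'cueva', 'chico', 'dama']
print(sorted(["af", "ä", "ad"], key=expansion_key))
# prints ['ad', 'ä', 'af']
```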
&lt;p&gt;Here is how to use the &lt;strong&gt;pyuca&lt;/strong&gt; module.&lt;/p&gt;
&lt;p&gt;Usage example:&lt;/p&gt;
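&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;from pyuca import Collator

# allkeys.txt is the DUCET file linked below
c = Collator(&amp;quot;allkeys.txt&amp;quot;)

words = [&amp;quot;caff&amp;quot;, &amp;quot;café&amp;quot;, &amp;quot;cafe&amp;quot;]  # any list of unicode strings
sorted_words = sorted(words, key=c.sort_key)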
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;allkeys.txt (1 MB) is available at&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;a href=&#34;http://www.unicode.org/Public/UCA/latest/allkeys.txt&#34;&gt;http://www.unicode.org/Public/UCA/latest/allkeys.txt&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;but you can always subset this for just the characters you are dealing with (and you will need to do this if any language-specific tailoring is needed)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE (2006-02-13)&lt;/strong&gt;: Now see &lt;a href=&#34;/2006/02/13/bug-fix-python-unicode-collation-algorithm/&#34;&gt;bug fix&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE (2012-06-21)&lt;/strong&gt;: Now see &lt;a href=&#34;https://github.com/jtauber/pyuca&#34;&gt;https://github.com/jtauber/pyuca&lt;/a&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">My preliminary attempt at a Python implementation of the Unicode Collation Algorithm (UCA) is done and available at:</summary>
  </entry><entry>
    <title type="html">File System Archaeology for MorphGNT</title>
    <link href="https://jktauber.com/2006/01/01/file-system-archaeology-morphgnt/" rel="alternate" type="text/html" title="File System Archaeology for MorphGNT"/>
    <published>2006-01-01</published>
    <updated>2006-01-01</updated>
    <id>https://jktauber.com/2006/01/01/file-system-archaeology-morphgnt</id>
    <content type="html" xml:base="https://jktauber.com/2006/01/01/file-system-archaeology-morphgnt/">&lt;p&gt;Some of you will be aware of &lt;a href=&#34;http://ulrikp.org&#34;&gt;Ulrik Petersen&lt;/a&gt;&#39;s &lt;a href=&#34;http://ulrikp.org/Tischendorf&#34;&gt;work&lt;/a&gt; on augmenting Tischendorf&#39;s 8th edition with morphological tags and lemmata, based on work by Clint Yale and Maurice Robinson. Ulrik is also the developer of &lt;a href=&#34;http://emdros.org/&#34;&gt;Emdros&lt;/a&gt;, an open-source text database engine for annotated text.&lt;/p&gt;
&lt;p&gt;The overlap of Ulrik&#39;s interests and work with my own on MorphGNT is very exciting, so we&#39;ve started talking about how we might be able to collaborate.&lt;/p&gt;
&lt;p&gt;To help facilitate this, I&#39;ve spent much of this long weekend so far going through the last 12 years of work on MorphGNT and putting things into Subversion. Because my work on MorphGNT has always been in fits and spurts and has spanned approximately five different desktop machines over the 12 years, it&#39;s required a fair bit of &#34;file system archaeology&#34;.&lt;/p&gt;
&lt;p&gt;The archaeology analogy seems apt because I&#39;m essentially piecing together a history based on what &#34;layer&#34; I&#39;m finding the files in; e.g. a file on a backup of my website in 2002 probably dates later than those found in the tarballs from when I moved from one machine to another in 1997.&lt;/p&gt;
&lt;p&gt;There&#39;s also an analogy with textual criticism, as in some cases I have to look at two files and judge whether a change from A to B or B to A is more likely.&lt;/p&gt;
&lt;p&gt;It&#39;s been a lot of fun, especially uncovering little scripts I wrote back in the nineties to do various analyses.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Some of you will be aware of &lt;a href=&#34;http://ulrikp.org&#34;&gt;Ulrik Petersen&lt;/a&gt;&#39;s &lt;a href=&#34;http://ulrikp.org/Tischendorf&#34;&gt;work&lt;/a&gt; on augmenting Tischendorf&#39;s 8th edition with morphological tags and lemmata, based on work by Clint Yale and Maurice Robinson. Ulrik is also the developer of &lt;a href=&#34;http://emdros.org/&#34;&gt;Emdros&lt;/a&gt;, an open-source text database engine for annotated text.</summary>
  </entry><entry>
    <title type="html">MorphGNT 5.08 Released</title>
    <link href="https://jktauber.com/2005/11/07/morphgnt-508-released/" rel="alternate" type="text/html" title="MorphGNT 5.08 Released"/>
    <published>2005-11-07</published>
    <updated>2005-11-07</updated>
    <id>https://jktauber.com/2005/11/07/morphgnt-508-released</id>
    <content type="html" xml:base="https://jktauber.com/2005/11/07/morphgnt-508-released/">&lt;p&gt;I&#39;m pleased to announce the release of a new version of MorphGNT, the morphologically parsed Greek New Testament database made available under a Creative Commons license.&lt;/p&gt;
&lt;p&gt;I haven&#39;t put together the change log yet but will shortly.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE (2005-11-08)&lt;/strong&gt;: The change log is now available on the MorphGNT page.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I&#39;m pleased to announce the release of a new version of MorphGNT, the morphologically parsed Greek New Testament database made available under a Creative Commons license.</summary>
  </entry><entry>
    <title type="html">MorphGNT 5.07 Released</title>
    <link href="https://jktauber.com/2005/08/31/morphgnt-507-released/" rel="alternate" type="text/html" title="MorphGNT 5.07 Released"/>
    <published>2005-08-31</published>
    <updated>2005-08-31</updated>
    <id>https://jktauber.com/2005/08/31/morphgnt-507-released</id>
    <content type="html" xml:base="https://jktauber.com/2005/08/31/morphgnt-507-released/">&lt;p&gt;I&#39;m pleased to announce the release of a new version of MorphGNT, the morphologically parsed Greek New Testament database made available under a Creative Commons license.&lt;/p&gt;
&lt;p&gt;See the MorphGNT page for a list of changes (47 changes in 940 places).&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I&#39;m pleased to announce the release of a new version of MorphGNT, the morphologically parsed Greek New Testament database made available under a Creative Commons license.</summary>
  </entry><entry>
    <title type="html">Upcoming new MorphGNT</title>
    <link href="https://jktauber.com/2005/08/30/upcoming-new-morphgnt/" rel="alternate" type="text/html" title="Upcoming new MorphGNT"/>
    <published>2005-08-30</published>
    <updated>2005-08-30</updated>
    <id>https://jktauber.com/2005/08/30/upcoming-new-morphgnt</id>
    <content type="html" xml:base="https://jktauber.com/2005/08/30/upcoming-new-morphgnt/">&lt;p&gt;I&#39;m just about to release MorphGNT 5.07 and, shortly after that, a major new release I&#39;ll designate 6.07.&lt;/p&gt;
&lt;p&gt;I&#39;ve decided not to reset the minor release number on a new major release, to emphasise the fact that 5.07 and 6.07 are identical in the data they have in common; the 6-series just adds some extra data.&lt;/p&gt;
&lt;p&gt;I haven&#39;t yet decided just how much extra data will make it into the 6-series releases, but one new addition will be a column containing the surface form / inflected form / reflex (take your pick of terminology) of each word taken in isolation.&lt;/p&gt;
&lt;p&gt;What do I mean by &#34;taken in isolation&#34;? Well, a word like μετά could appear in the text as μετά, μεθ&#39;, μετ&#39; or μετὰ depending on the text after it. This new column normalises that to μετά. This happens to also be the lemma, so it might not be clear what the extra value is in this case. So consider the text in Matthew 1.20, which reads:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;παραλαβεῖν Μαρίαν τὴν γυναῖκά σου&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Note that τὴν has a grave accent and γυναῖκά has two accents. If you were to ask someone what the accusative singular feminine article is, they&#39;d say τήν not τὴν. Similarly, if you asked someone what the accusative of γυνή is, they&#39;d say γυναῖκα not γυναῖκά. The reason for the differing accentuation in the text is the context: a final-syllable acute becomes grave unless clause-final, and enclitics like σου throw their accent back to the end of the previous word.&lt;/p&gt;
&lt;p&gt;Sometimes you want to treat the variations these cause as distinct, sometimes you don&#39;t. By including the extra column, users of MorphGNT will have the best of both worlds.&lt;/p&gt;
&lt;p&gt;Here is a list of possible differences between the existing text column and the new column:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;existing text may exhibit elision (e.g. μετ&#39; versus μετά)&lt;/li&gt;
&lt;li&gt;existing text may exhibit movable ς or ν&lt;/li&gt;
&lt;li&gt;final-acute may become grave&lt;/li&gt;
&lt;li&gt;enclitics may lose an accent&lt;/li&gt;
&lt;li&gt;word preceding an enclitic may gain an extra accent&lt;/li&gt;
&lt;li&gt;the οὐ / οὐκ / οὐχ alternation&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The new column normalises all these differences.&lt;/p&gt;
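&lt;p&gt;As a rough illustration of the two accent rules only, here is a minimal Python sketch. It is not the code used to build the new column, and the function name normalise_accents is hypothetical. Elision, movable ς or ν, the οὐ / οὐκ / οὐχ alternation and enclitics that have lost their own accent are not handled, since those need a lexicon rather than a character-level rule.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;import unicodedata

GRAVE = "\u0300"
ACUTE = "\u0301"
CIRCUMFLEX = "\u0342"  # combining Greek perispomeni
ACCENTS = {GRAVE, ACUTE, CIRCUMFLEX}


def normalise_accents(word):
    """Return word accented as it would be when cited in isolation."""
    chars = list(unicodedata.normalize("NFD", word))
    # A grave only ever stands in for a final-syllable acute.
    chars = [ACUTE if ch == GRAVE else ch for ch in chars]
    # Any accent after the first is assumed to come from a following enclitic.
    positions = [i for i, ch in enumerate(chars) if ch in ACCENTS]
    for i in reversed(positions[1:]):
        del chars[i]
    return unicodedata.normalize("NFC", "".join(chars))


for form in ["τὴν", "γυναῖκά", "μετὰ"]:
    print(form, normalise_accents(form))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Working on the decomposed (NFD) form means each accent is a separate combining character, so the rules can be applied without a table of every precomposed vowel.&lt;/p&gt;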
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I&#39;m just about to release MorphGNT 5.07 and, shortly after that, a major new release I&#39;ll designate 6.07.</summary>
  </entry><entry>
    <title type="html">Using Simulated Annealing to Order Goal Prerequisites</title>
    <link href="https://jktauber.com/2005/08/03/using-simulated-annealing-order-goal-prerequisites/" rel="alternate" type="text/html" title="Using Simulated Annealing to Order Goal Prerequisites"/>
    <published>2005-08-03T13:50:57</published>
    <updated>2005-08-03T13:50:57</updated>
    <id>https://jktauber.com/2005/08/03/using-simulated-annealing-order-goal-prerequisites</id>
    <content type="html" xml:base="https://jktauber.com/2005/08/03/using-simulated-annealing-order-goal-prerequisites/">&lt;p&gt;Back in November, I wrote about &lt;a href=&#34;/2004/11/26/programmed-vocabulary-learning-travelling-salesman/&#34;&gt;programmed vocabulary learning as a travelling salesman problem&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I&#39;m pleased to say I&#39;ve finally cleaned up my Python code and made an initial version available at:&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;http://jtauber.com/2005/08/sa_prereq_ordering.py&#34;&gt;http://jtauber.com/2005/08/sa_prereq_ordering.py&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE (2005-08-04)&lt;/strong&gt;: You probably don&#39;t want to use the above script. See &lt;a href=&#34;/2005/08/03/ordering-goals-rather-prerequisites/&#34;&gt;Ordering Goals Rather Than Prerequisites&lt;/a&gt; for why, along with a much improved script.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Back in November, I wrote about &lt;a href=&#34;/2004/11/26/programmed-vocabulary-learning-travelling-salesman/&#34;&gt;programmed vocabulary learning as a travelling salesman problem&lt;/a&gt;.</summary>
  </entry><entry>
    <title type="html">Ordering Goals Rather Than Prerequisites</title>
    <link href="https://jktauber.com/2005/08/03/ordering-goals-rather-prerequisites/" rel="alternate" type="text/html" title="Ordering Goals Rather Than Prerequisites"/>
    <published>2005-08-03T13:53:20</published>
    <updated>2005-08-03T13:53:20</updated>
    <id>https://jktauber.com/2005/08/03/ordering-goals-rather-prerequisites</id>
    <content type="html" xml:base="https://jktauber.com/2005/08/03/ordering-goals-rather-prerequisites/">&lt;p&gt;The outcome of my &lt;a href=&#34;/2005/08/03/using-simulated-annealing-order-goal-prerequisites/&#34;&gt;simulated annealing program&lt;/a&gt; is a list of prerequisites to learn along with an indication, every so often, of what new goal has been reached.&lt;/p&gt;
&lt;p&gt;Running on the Greek lexemes of 1John, you might get something starting like this:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;learn μαρτυρέω
learn θεός
learn ἐν
learn εἰμί
learn ὁ
learn τρεῖς
learn ὅτι
know 230507
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This gives seven prerequisites to learn and then a goal that has been reached (230507 = 1John 5.7). The problem is that two of those words are unnecessary. You only need to learn μαρτυρέω, εἰμί, ὁ, τρεῖς and ὅτι to be able to read 1John 5.7.&lt;/p&gt;
&lt;p&gt;This happens because the program orders prerequisites first and only then establishes, at each point, what goals (if any) have been achieved.&lt;/p&gt;
&lt;p&gt;I can see two solutions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;write a post-processor that walks through and, at each goal, takes any &#34;unused&#34; prerequisites and postpones them to after that goal.&lt;/li&gt;
&lt;li&gt;change the program to order goals rather than prerequisites and work out the latter from the former&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The second is probably considerably more work but ultimately preferable.&lt;/p&gt;
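&lt;p&gt;For comparison, the first option (the post-processor) could look something like the following minimal sketch. The function name, the (kind, value) pair representation and the requirements mapping are all assumptions made for illustration; none of this is taken from the actual script.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;def postpone_unused(steps, requirements):
    """Move prerequisites that a goal does not need to after that goal.

    steps: list of ("learn", item) and ("know", goal) pairs in program order.
    requirements: dict mapping each goal to the set of items it needs.
    """
    result = []
    pending = []  # learned since the last goal, not yet known to be needed
    for kind, value in steps:
        if kind == "learn":
            pending.append(value)
            continue
        needed = requirements[value]
        result.extend(("learn", item) for item in pending if item in needed)
        result.append(("know", value))
        # Unused items are carried over and re-examined at the next goal.
        pending = [item for item in pending if item not in needed]
    result.extend(("learn", item) for item in pending)
    return result


steps = [("learn", w) for w in
         ["μαρτυρέω", "θεός", "ἐν", "εἰμί", "ὁ", "τρεῖς", "ὅτι"]]
steps.append(("know", "230507"))
requirements = {"230507": {"μαρτυρέω", "εἰμί", "ὁ", "τρεῖς", "ὅτι"}}

for kind, value in postpone_unused(steps, requirements):
    print(kind, value)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;On the 1John 5.7 example, this puts the five needed words first, then the goal, and moves θεός and ἐν to after it.&lt;/p&gt;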
&lt;p&gt;&lt;strong&gt;UPDATE&lt;/strong&gt;: I&#39;m almost embarrassed to report that not only was changing over to ordering goals not as hard to do as I thought, but the particular way I did it performs 200 times faster than my previous prerequisite ordering script. New script is at &lt;a href=&#34;http://jtauber.com/2005/08/sa_goal_ordering.py&#34;&gt;http://jtauber.com/2005/08/sa_goal_ordering.py&lt;/a&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">The outcome of my &lt;a href=&#34;/2005/08/03/using-simulated-annealing-order-goal-prerequisites/&#34;&gt;simulated annealing program&lt;/a&gt; is a list of prerequisites to learn along with an indication, every so often, of what new goal has been reached.</summary>
  </entry><entry>
    <title type="html">Parts of Speech and Number of Accents</title>
    <link href="https://jktauber.com/2005/07/16/parts-speech-and-number-accents/" rel="alternate" type="text/html" title="Parts of Speech and Number of Accents"/>
    <published>2005-07-16</published>
    <updated>2005-07-16</updated>
    <id>https://jktauber.com/2005/07/16/parts-speech-and-number-accents</id>
    <content type="html" xml:base="https://jktauber.com/2005/07/16/parts-speech-and-number-accents/">&lt;p&gt;I thought I&#39;d write a quick Python script to check how many accents were on each of the lemmata in MorphGNT 5.06.&lt;/p&gt;
&lt;p&gt;Here are the counts by part of speech and number of accents on lemma:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;|     |  0      |  1      |  2  |
+-----+---------+---------+-----+
| A   |  -      |  9159   |  -  |
| C   |  924    |  17361  |  -  |
| D   |  1592   |  4606   |  -  |
| I   |  -      |  17     |  -  |
| N   |  30     |  28271  |  1  |
| P   |  5433   |  5488   |  -  |
| RA  |  19862  |  4      |  -  |
| RD  |  -      |  1744   |  -  |
| RI  |  -      |  1165   |  -  |
| RP  |  -      |  11584  |  -  |
| RR  |  -      |  1677   |  -  |
| V   |  8      |  28101  |  1  |
| X   |  147    |  844    |  -  |
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Some of the low numbers are definitely errors in the database. Now to investigate...&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE (2005-07-16)&lt;/strong&gt;: both 2-accent cases were mistakes. The 30 0-accent nouns and 5 of the 0-accent verbs were foreign loan words that intentionally weren&#39;t accented but 3 of the 0-accent verbs were mistakes. The 4 accented articles were the result of crasis with the following noun and the word should probably be analyzed as a noun rather than an article. I guess there&#39;ll be a 5.07 release soon. NOTE: I haven&#39;t looked at the particles, adverbs, conjunctions or prepositions yet.&lt;/p&gt;
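&lt;p&gt;The script itself isn&#39;t reproduced here, but a check of this kind could be sketched as follows. This assumes the data has already been read into (part of speech, lemma) pairs; the function names and the four sample rows are made up for illustration and are not MorphGNT data.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;import unicodedata
from collections import Counter

# Combining grave, acute and Greek perispomeni (circumflex).
ACCENTS = {"\u0300", "\u0301", "\u0342"}


def count_accents(word):
    """Number of accent marks on word, ignoring breathings and diaeresis."""
    return sum(ch in ACCENTS for ch in unicodedata.normalize("NFD", word))


def tabulate(rows):
    """Count (part of speech, number of accents) over (pos, lemma) pairs."""
    return Counter((pos, count_accents(lemma)) for pos, lemma in rows)


# Hypothetical sample rows, not taken from the database.
sample = [("N", "θεός"), ("RA", "ὁ"), ("V", "εἰμί"), ("P", "ἐν")]
for (pos, n), total in sorted(tabulate(sample).items()):
    print(pos, n, total)
&lt;/code&gt;&lt;/pre&gt;
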
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I thought I&#39;d write a quick Python script to check how many accents were on each of the lemmata in MorphGNT 5.06.</summary>
  </entry><entry>
    <title type="html">MorphGNT 5.06 Released</title>
    <link href="https://jktauber.com/2005/07/16/morphgnt-506-released/" rel="alternate" type="text/html" title="MorphGNT 5.06 Released"/>
    <published>2005-07-16</published>
    <updated>2005-07-16</updated>
    <id>https://jktauber.com/2005/07/16/morphgnt-506-released</id>
    <content type="html" xml:base="https://jktauber.com/2005/07/16/morphgnt-506-released/">&lt;p&gt;Well, it&#39;s been about a hundred hours&#39; work over the last six months, but I&#39;m pleased to announce the release of a new version of MorphGNT, the morphologically parsed Greek New Testament database made available under a Creative Commons license.&lt;/p&gt;
&lt;p&gt;Besides some corrections to the text (mostly rho-breathing) and a couple of parsing code changes, this release has a huge number of corrections to the lemmata—160 lemma changes in 465 places. See &lt;a href=&#34;/2005/04/19/current-morphgnt-work/&#34;&gt;this blog entry&lt;/a&gt; for how potential errors for this round of corrections were discovered.&lt;/p&gt;
&lt;p&gt;You can download the new file at:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;http://jtauber.com/2005/morphgnt/ccat-tauber-morphgnt-v5_06.zip&#34;&gt;http://jtauber.com/2005/morphgnt/ccat-tauber-morphgnt-v5_06.zip&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Well, it&#39;s been about a hundred hours&#39; work over the last six months, but I&#39;m pleased to announce the release of a new version of MorphGNT, the morphologically parsed Greek New Testament database made available under a Creative Commons license.</summary>
  </entry><entry>
    <title type="html">MorphGNT Roadmap</title>
    <link href="https://jktauber.com/2005/07/04/morphgnt-roadmap/" rel="alternate" type="text/html" title="MorphGNT Roadmap"/>
    <published>2005-07-04</published>
    <updated>2005-07-04</updated>
    <id>https://jktauber.com/2005/07/04/morphgnt-roadmap</id>
    <content type="html" xml:base="https://jktauber.com/2005/07/04/morphgnt-roadmap/">&lt;p&gt;This month I should be doing another release of my morphologically-parsed Greek New Testament. This will be release 5.06. I thought I&#39;d outline my future plans (as they currently stand).&lt;/p&gt;
&lt;p&gt;At some point, I&#39;ll start doing 6.xx releases. This will involve a format change that includes some more information. I&#39;ll probably continue the 5-series releases for people used to the format. The 5-series data is just a subset of the 6-series data so it&#39;s always possible (and easy) for me to generate a 5 from a 6.&lt;/p&gt;
&lt;p&gt;From Series-7, MorphGNT&#39;s format will likely change dramatically to adopt a graph structure rather than a simple tabular structure. This will enable much greater extensibility and annotation.&lt;/p&gt;
&lt;p&gt;Series-7 will be the last that is based on the CCAT database. From Series-8 onwards, the data will hopefully be completely the results of my own parsing work.&lt;/p&gt;
&lt;p&gt;First things first, though—getting 5.06 out. I&#39;m down to 299 mismatches to resolve.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">This month I should be doing another release of my morphologically-parsed Greek New Testament. This will be release 5.06. I thought I&#39;d outline my future plans (as they currently stand).</summary>
  </entry><entry>
    <title type="html">MorphGNT Update</title>
    <link href="https://jktauber.com/2005/06/10/morphgnt-update/" rel="alternate" type="text/html" title="MorphGNT Update"/>
    <published>2005-06-10</published>
    <updated>2005-06-10</updated>
    <id>https://jktauber.com/2005/06/10/morphgnt-update</id>
    <content type="html" xml:base="https://jktauber.com/2005/06/10/morphgnt-update/">&lt;p&gt;A couple of months ago, I &lt;a href=&#34;/2005/04/19/current-morphgnt-work/&#34;&gt;talked about&lt;/a&gt; the current process I&#39;m going through to identify errors in my morphologically parsed Greek New Testament, MorphGNT. By the end of April, I was down to 400 mismatches I needed to check. At the time, I thought I&#39;d be able to finish going through them by the time I left to go to Europe on holiday.&lt;/p&gt;
&lt;p&gt;Unfortunately, I haven&#39;t actually worked on it at all in the last month. I&#39;m leaving tomorrow but still have 350 mismatches to check (an estimated 14 hours&#39; work).&lt;/p&gt;
&lt;p&gt;Hopefully I&#39;ll get it done some time during July and then I&#39;ll be able to release another version of MorphGNT.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">A couple of months ago, I &lt;a href=&#34;/2005/04/19/current-morphgnt-work/&#34;&gt;talked about&lt;/a&gt; the current process I&#39;m going through to identify errors in my morphologically parsed Greek New Testament, MorphGNT. By the end of April, I was down to 400 mismatches I needed to check. At the time, I thought I&#39;d be able to finish going through them by the time I left to go to Europe on holiday.</summary>
  </entry><entry>
    <title type="html">DATR in Python</title>
    <link href="https://jktauber.com/2005/04/19/datr-python/" rel="alternate" type="text/html" title="DATR in Python"/>
    <published>2005-04-19</published>
    <updated>2005-04-19</updated>
    <id>https://jktauber.com/2005/04/19/datr-python</id>
    <content type="html" xml:base="https://jktauber.com/2005/04/19/datr-python/">&lt;p&gt;I &lt;a href=&#34;/2005/01/19/datr-morphgnt-rdf-and-python/&#34;&gt;previously&lt;/a&gt; talked about wanting to implement the lexicon language DATR in Python. Well, I just received an email from Henrik Weber saying that (apparently inspired by my post) he has gone and done an implementation at &lt;a href=&#34;http://pydatr.sourceforge.net/&#34;&gt;http://pydatr.sourceforge.net/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Well done Henrik! I&#39;m looking forward to trying it out and maybe contributing.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I &lt;a href=&#34;/2005/01/19/datr-morphgnt-rdf-and-python/&#34;&gt;previously&lt;/a&gt; talked about wanting to implement the lexicon language DATR in Python. Well, I just received an email from Henrik Weber saying that (apparently inspired by my post) he has gone and done an implementation at &lt;a href=&#34;http://pydatr.sourceforge.net/&#34;&gt;http://pydatr.sourceforge.net/&lt;/a&gt;</summary>
  </entry><entry>
    <title type="html">Current MorphGNT Work</title>
    <link href="https://jktauber.com/2005/04/19/current-morphgnt-work/" rel="alternate" type="text/html" title="Current MorphGNT Work"/>
    <published>2005-04-19</published>
    <updated>2005-04-19</updated>
    <id>https://jktauber.com/2005/04/19/current-morphgnt-work</id>
    <content type="html" xml:base="https://jktauber.com/2005/04/19/current-morphgnt-work/">&lt;p&gt;For the last few months, I&#39;ve been making corrections to MorphGNT by attempting to merge an English translation (NASB) marked with Strong&#39;s numbers with my database. Although it&#39;s a tedious process, it&#39;s revealing numerous errors.&lt;/p&gt;
&lt;p&gt;When James Strong compiled his concordance, he assigned a number to every lemma in the underlying Greek text of the King James Version. Other translations are often made available annotated with these Strong&#39;s numbers. &lt;a href=&#34;http://www.zhubert.com&#34;&gt;Zack Hubert&lt;/a&gt; provided me with an electronic text of the NASB translation with Strong&#39;s numbers which I converted to something looking like this:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;010101 record 976
010101 genealogy 1078
010101 Jesus 2424
010101 Messiah 5547
010101 son 5207
010101 son 5207
010101 Abraham 11
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The first column is the book, chapter and verse, the second column is the English word as it appears in the NASB translation and the third column is the Strong&#39;s number. Note that not all words are included.&lt;/p&gt;
&lt;p&gt;I then found an electronic text of Strong&#39;s lexicon and stripped out the formatting and the definitions to just get a list of Strong&#39;s numbers with a transliteration of the Greek lemma:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;1 a
2 Aaron
3 Abaddon
4 abares
5 Abba
6 Abel
7 Abia
8 Abiathar
9 Abilene
10 Abioud
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Finally, I took my MorphGNT database and extracted the lemmata:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;010101 βίβλος
010101 γένεσις
010101 Ἰησοῦς
010101 Χριστός
010101 υἱός
010101 Δαυίδ
010101 υἱός
010101 Ἀβραάμ
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I then wrote a Python program that attempts to merge the first and third files on the basis of the second. Note that the transliterations in Strong&#39;s lexicon don&#39;t have accents and there is ambiguity too (both epsilon and eta go to &#39;e&#39;). That&#39;s a fairly straightforward part of the join, however, because it can be automated by the script.&lt;/p&gt;
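&lt;p&gt;To give an idea of that automated part, here is a minimal sketch of comparing a Greek lemma with an unaccented transliteration. The romanisation table is an assumption (the scheme used in the Strong&#39;s file isn&#39;t spelled out here), the function names are hypothetical, and details such as initial ῥ are ignored.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;import unicodedata

# Assumed romanisation table. Epsilon and eta both go to "e"; treating
# omicron and omega both as "o" is a further assumption.
LATIN = {
    "α": "a", "β": "b", "γ": "g", "δ": "d", "ε": "e", "ζ": "z", "η": "e",
    "θ": "th", "ι": "i", "κ": "k", "λ": "l", "μ": "m", "ν": "n", "ξ": "x",
    "ο": "o", "π": "p", "ρ": "r", "σ": "s", "ς": "s", "τ": "t", "υ": "u",
    "φ": "ph", "χ": "ch", "ψ": "ps", "ω": "o",
}
ROUGH_BREATHING = "\u0314"


def transliterate(lemma):
    """Romanise a Greek lemma, dropping accents and smooth breathings."""
    decomposed = unicodedata.normalize("NFD", lemma.lower())
    letters = "".join(LATIN.get(ch, "") for ch in decomposed)
    # A rough breathing surfaces as an initial "h".
    return ("h" if ROUGH_BREATHING in decomposed else "") + letters


def same_lemma(greek_lemma, strongs_transliteration):
    """True if the two spellings agree once accents and case are ignored."""
    return transliterate(greek_lemma) == strongs_transliteration.lower()


for lemma in ["βίβλος", "γένεσις", "υἱός", "Ἀβραάμ"]:
    print(lemma, transliterate(lemma))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Because the comparison runs from Greek to Latin, the epsilon/eta ambiguity never has to be resolved: two different Greek lemmata may map to the same string, but a given lemma always maps to exactly one.&lt;/p&gt;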
&lt;p&gt;The real challenge comes because:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;NASB versification isn&#39;t the same as the MorphGNT Greek text&lt;/li&gt;
&lt;li&gt;the text underlying the NASB is not the same critical text as that of MorphGNT&lt;/li&gt;
&lt;li&gt;there are errors in each of the files&lt;/li&gt;
&lt;li&gt;there are spelling differences&lt;/li&gt;
&lt;li&gt;there are differences in the granularity of the lemmata&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So my program simply indicates whenever it has trouble performing a match and I have to either:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;correct my MorphGNT lemma&lt;/li&gt;
&lt;li&gt;correct (or merely change to my lemma conventions) the Strong&#39;s lexicon file&lt;/li&gt;
&lt;li&gt;correct the NASB-Strong file&lt;/li&gt;
&lt;li&gt;change the verse numbering in the NASB-Strong file&lt;/li&gt;
&lt;li&gt;comment out a particular word that appears in the text underlying the NASB but not the MorphGNT text&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There were initially thousands of exceptions that each required one of these actions. After a number of months, I now have one thousand left. It takes me about 4 hours to make 100 corrections so I still have a little way to go.&lt;/p&gt;
&lt;p&gt;When I&#39;m done, I&#39;ll release a new version of MorphGNT that corrects the lemma errors this task revealed.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">For the last few months, I&#39;ve been making corrections to MorphGNT by attempting to merge an English translation (NASB) marked with Strong&#39;s numbers with my database. Although it&#39;s a tedious process, it&#39;s revealing numerous errors.</summary>
  </entry><entry>
    <title type="html">BetaCode to Unicode in Python</title>
    <link href="https://jktauber.com/2005/01/27/betacode-unicode-python/" rel="alternate" type="text/html" title="BetaCode to Unicode in Python"/>
    <published>2005-01-27</published>
    <updated>2005-01-27</updated>
    <id>https://jktauber.com/2005/01/27/betacode-unicode-python</id>
    <content type="html" xml:base="https://jktauber.com/2005/01/27/betacode-unicode-python/">&lt;p&gt;BetaCode is a common ASCII transcription for Polytonic Greek. I&#39;ve been dealing with it for around twelve years. (As an aside, back in 1994, I designed a METAFONT for Polytonic Greek that enabled one to use BetaCode in TeX—I typeset my self-published &lt;em&gt;Index to the Greek New Testament&lt;/em&gt; with it).&lt;/p&gt;
&lt;p&gt;For the last six years, my preference has been to use Unicode, so I wrote a program (initially in Java but then in Python) that used a &lt;em&gt;Trie&lt;/em&gt; to represent the multiple BetaCode characters that can map to a single pre-composed Unicode character.&lt;/p&gt;
&lt;p&gt;I&#39;ve had a version available on this site since 2002, but I&#39;ve now updated it to what I&#39;ve been using for my most recent work. You can download it at &lt;a href=&#34;http://jtauber.com/2004/11/beta2unicode.py&#34;&gt;http://jtauber.com/2004/11/beta2unicode.py&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;At some stage I&#39;ll better factor out the conversion pairs so the code is useful for other conversions. The Trie code might be useful for other contexts too.&lt;/p&gt;
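&lt;p&gt;For readers who haven&#39;t met the idea, here is a minimal sketch of a trie used for longest-match conversion. It is not the code in beta2unicode.py; the class and function names are hypothetical and the conversion pairs are a tiny illustrative subset of Beta Code.&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;class Trie:
    """Maps key strings to values and finds the longest key at a position."""

    def __init__(self):
        self.root = {}

    def add(self, key, value):
        node = self.root
        for ch in key:
            node = node.setdefault(ch, {})
        node[None] = value  # None marks "a key ends here"

    def longest_match(self, text, start):
        """Return (value, end) for the longest key at text[start:], or None."""
        node, best = self.root, None
        for i in range(start, len(text)):
            node = node.get(text[i])
            if node is None:
                break
            if None in node:
                best = (node[None], i + 1)
        return best


def convert(text, trie):
    out, i = [], 0
    while i != len(text):
        match = trie.longest_match(text, i)
        if match is None:
            out.append(text[i])  # pass unknown characters through
            i += 1
        else:
            out.append(match[0])
            i = match[1]
    return "".join(out)


# A tiny illustrative subset of conversion pairs, not the full table.
trie = Trie()
for beta, uni in [("L", "λ"), ("O", "ο"), ("O/", "ό"), ("G", "γ"),
                  ("N", "ν"), ("A", "α"), ("A)", "ἀ"), ("A)/", "ἄ")]:
    trie.add(beta, uni)

print(convert("LO/GON", trie))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The longest-match step is what lets A, A) and A)/ coexist as keys: the converter keeps walking down the trie for as long as the input allows and emits the value of the deepest key it passed.&lt;/p&gt;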
&lt;p&gt;(Also see Ricoblog&#39;s &lt;a href=&#34;http://www.supakoo.com/rick/ricoblog/PermaLink,guid,c13cfcd6-92de-4f5d-8256-400e45c5e25d.aspx&#34;&gt;Converting Greek Beta Code into Normalized Unicode&lt;/a&gt;.)&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">BetaCode is a common ASCII transcription for Polytonic Greek. I&#39;ve been dealing with it for around twelve years. (As an aside, back in 1994, I designed a METAFONT for Polytonic Greek that enabled one to use BetaCode in TeX—I typeset my self-published &lt;em&gt;Index to the Greek New Testament&lt;/em&gt; with it).</summary>
  </entry><entry>
    <title type="html">DATR, MorphGNT, RDF and Python</title>
    <link href="https://jktauber.com/2005/01/19/datr-morphgnt-rdf-and-python/" rel="alternate" type="text/html" title="DATR, MorphGNT, RDF and Python"/>
    <published>2005-01-19</published>
    <updated>2005-01-19</updated>
    <id>https://jktauber.com/2005/01/19/datr-morphgnt-rdf-and-python</id>
    <content type="html" xml:base="https://jktauber.com/2005/01/19/datr-morphgnt-rdf-and-python/">&lt;p&gt;I&#39;ve been revisiting &lt;a href=&#34;http://www.datr.org/&#34;&gt;DATR&lt;/a&gt;, the lexical knowledge representation language, as a possible format for the next generation of MorphGNT. I was previously considering developing my own RDF/graph-based format but I suddenly remembered DATR from my student days and it makes a lot more sense to use it rather than try to build my own.&lt;/p&gt;
&lt;p&gt;Looking at DATR material, I haven&#39;t seen anything more recent than 1998 so I&#39;m not sure if it&#39;s still the state-of-the-art. It&#39;s a natural fit for some kind of RDFization, something I&#39;m sure I&#39;ll eventually end up doing if someone hasn&#39;t already.&lt;/p&gt;
&lt;p&gt;Of course, I&#39;ll have to write Python code to manipulate DATR. Again, unless some already exists. But I&#39;m almost hoping not as I love implementing specs, especially using test-driven development.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE 2005-04-19&lt;/strong&gt;: Now see &lt;a href=&#34;/2005/04/19/datr-python/&#34;&gt;DATR in Python&lt;/a&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I&#39;ve been revisiting &lt;a href=&#34;http://www.datr.org/&#34;&gt;DATR&lt;/a&gt;, the lexical knowledge representation language, as a possible format for the next generation of MorphGNT. I was previously considering developing my own RDF/graph-based format but I suddenly remembered DATR from my student days and it makes a lot more sense to use it rather than try to build my own.</summary>
  </entry><entry>
    <title type="html">Thoughts on GNT-NET Parallel Glossing Project</title>
    <link href="https://jktauber.com/2004/12/14/thoughts-gnt-net-parallel-glossing-project/" rel="alternate" type="text/html" title="Thoughts on GNT-NET Parallel Glossing Project"/>
    <published>2004-12-14</published>
    <updated>2004-12-14</updated>
    <id>https://jktauber.com/2004/12/14/thoughts-gnt-net-parallel-glossing-project</id>
    <content type="html" xml:base="https://jktauber.com/2004/12/14/thoughts-gnt-net-parallel-glossing-project/">&lt;p&gt;Zack Hubert &lt;a href=&#34;http://zhubert.com/node/view/20&#34;&gt;mentions&lt;/a&gt; that I&#39;m thinking about using the &lt;a href=&#34;http://bible.org/&#34;&gt;NET Bible&lt;/a&gt; for a collaborative parallel glossing project.&lt;/p&gt;
&lt;p&gt;Here is how it might work:&lt;/p&gt;
&lt;p&gt;The user is presented with the Greek text and the NET text.&lt;/p&gt;
&lt;p&gt;Consider Luke 1.1. The Greek reads:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Ἐπειδήπερ πολλοὶ ἐπεχείρησαν ἀνατάξασθαι διήγησιν περὶ τῶν πεπληροφορημένων ἐν ἡμῖν πραγμάτων,&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The NET reads&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Now many have undertaken to compile an account of the things that have been fulfilled among us,&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It should be possible to select any number of words in the Greek and any number of words from the NET and assert that they correspond (or link) to one another. There is no need to link between the entire verse of Greek and the entire verse of the NET because that link has already been made automatically.&lt;/p&gt;
&lt;p&gt;Say the user selects Ἐπειδήπερ. They should then be shown the part-of-speech and parse information for the word (in this case C) as well as the lexical form, ἐπειδήπερ. The user should also be shown all previous glosses for ἐπειδήπερ in other contexts.&lt;/p&gt;
&lt;p&gt;The user is then instructed to select the word or words that directly translate ἐπειδήπερ. In this case, the user selects &lt;em&gt;Now&lt;/em&gt; and submits.&lt;/p&gt;
&lt;p&gt;The user need not progress in order. Say the next thing they select is the word πραγμάτων. As before, they are shown the part-of-speech and parse information (N-GPN) and the lexical form, πρᾶγμα. Again, the user is shown previous glosses. These glosses should include those specifically for πραγμάτων as well as those for other forms of πρᾶγμα, perhaps displayed differently.&lt;/p&gt;
&lt;p&gt;The user then selects &lt;em&gt;things&lt;/em&gt; and submits.&lt;/p&gt;
&lt;p&gt;It should be possible to select multiple Greek words and link them to just one word from NET. It should also be possible to select one Greek word and link it to multiple words in the NET. Many-to-many links should also be possible. For example, a user could select περὶ τῶν πεπληροφορημένων ἐν ἡμῖν πραγμάτων and &lt;em&gt;of the things that have been fulfilled among us&lt;/em&gt; and submit that linkage.&lt;/p&gt;
&lt;p&gt;It is also possible that some words won’t link to anything.&lt;/p&gt;
&lt;p&gt;Many-to-many linkages should be encouraged where the particular sense of a word is entirely determined by its use in a sequence (such as an idiom).&lt;/p&gt;
&lt;p&gt;Users should be discouraged from doing many-to-many linkages where the sequence isn&#39;t a grammatical unit such as a phrase. For example, a user shouldn&#39;t submit a link between περὶ τῶν and &lt;em&gt;of the&lt;/em&gt;. This clearly can&#39;t be enforced.&lt;/p&gt;
&lt;p&gt;Users should be required to log in before they can submit linkages. Each linkage will be stored with the email address of the person that made the linkage.&lt;/p&gt;
&lt;p&gt;While users may be encouraged to work on particular verses, they should be free to go to whatever verses interest them. Duplicate effort is not a problem and provides redundancy. The data can be checked later for inconsistencies.&lt;/p&gt;
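&lt;p&gt;As a rough illustration only, a linkage of the kind described above could be modelled as two sets of word positions plus the submitter&#39;s email address. Everything named here (the Linkage class, its fields, the words helper, the verse label and the example address) is hypothetical and not part of any existing design.&lt;/p&gt;

```python
from dataclasses import dataclass

# Luke 1.1 split into words (trailing punctuation dropped for simplicity).
GREEK = "Ἐπειδήπερ πολλοὶ ἐπεχείρησαν ἀνατάξασθαι διήγησιν περὶ τῶν πεπληροφορημένων ἐν ἡμῖν πραγμάτων".split()
NET = "Now many have undertaken to compile an account of the things that have been fulfilled among us".split()


@dataclass(frozen=True)
class Linkage:
    """One submitted link between Greek words and NET words in a verse."""
    verse: str    # hypothetical verse identifier
    greek: tuple  # zero-based positions in the Greek verse
    net: tuple    # zero-based positions in the NET verse
    email: str    # who submitted the linkage


def words(link):
    """Return the linked Greek and NET words for display or checking."""
    return [GREEK[i] for i in link.greek], [NET[i] for i in link.net]


links = [
    # One-to-one links, submitted in any order.
    Linkage("Luke 1.1", (0,), (0,), "user@example.com"),
    Linkage("Luke 1.1", (10,), (10,), "user@example.com"),
    # A many-to-many link covering a whole phrase.
    Linkage("Luke 1.1", tuple(range(5, 11)), tuple(range(8, 17)), "user@example.com"),
]

for link in links:
    print(words(link))
```

&lt;p&gt;Storing positions rather than the words themselves keeps repeated words (such as the two occurrences of &lt;em&gt;have&lt;/em&gt;) distinct, and words that link to nothing simply never appear in any linkage.&lt;/p&gt;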
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Zack Hubert &lt;a href=&#34;http://zhubert.com/node/view/20&#34;&gt;mentions&lt;/a&gt; that I&#39;m thinking about using the &lt;a href=&#34;http://bible.org/&#34;&gt;NET Bible&lt;/a&gt; for a collaborative parallel glossing project.</summary>
  </entry><entry>
    <title type="html">MorphGNT v5.05 Available</title>
    <link href="https://jktauber.com/2004/12/14/morphgnt-v505-available/" rel="alternate" type="text/html" title="MorphGNT v5.05 Available"/>
    <published>2004-12-14</published>
    <updated>2004-12-14</updated>
    <id>https://jktauber.com/2004/12/14/morphgnt-v505-available</id>
    <content type="html" xml:base="https://jktauber.com/2004/12/14/morphgnt-v505-available/">&lt;p&gt;Various corrections.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Corrected occurrence of ἐμβάλλω for lemma instead of ἐμβλέπω or ἐμβαίνω (thanks to Ted Blakley via Zack Hubert)&lt;/li&gt;
&lt;li&gt;Denormalized variant spellings of Ναζαρά&lt;/li&gt;
&lt;li&gt;Corrected parse codes of κἀκεῖνος, θρόνοι&lt;/li&gt;
&lt;li&gt;Added comparative parse code for σπουδαιοτέρως&lt;/li&gt;
&lt;li&gt;Changed lemmata for ἀκριβέστερον, περισσότερον, τολμηρότερον&lt;/li&gt;
&lt;li&gt;Changed lemmata for οὕτως, εἵνεκεν, ἑλπίς&lt;/li&gt;
&lt;li&gt;Corrected lemma for ζώνην and ζώνη&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Various corrections.</summary>
  </entry><entry>
    <title type="html">Best Use of MorphGNT So Far</title>
    <link href="https://jktauber.com/2004/12/14/best-use-morphgnt-so-far/" rel="alternate" type="text/html" title="Best Use of MorphGNT So Far"/>
    <published>2004-12-14</published>
    <updated>2004-12-14</updated>
    <id>https://jktauber.com/2004/12/14/best-use-morphgnt-so-far</id>
    <content type="html" xml:base="https://jktauber.com/2004/12/14/best-use-morphgnt-so-far/">&lt;p&gt;Zack Hubert has taken my [MorphGNT] and built a &lt;a href=&#34;http://zhubert.com&#34;&gt;GNT Browser&lt;/a&gt; that blew me away!&lt;/p&gt;
&lt;p&gt;It displays the text in the browser; hover on a word and the lemma and parsing are shown in a pop-up; click on the word and you get a graph of word occurrence by book with the ability to list all occurrences.&lt;/p&gt;
&lt;p&gt;I&#39;ve toyed with web interfaces to the MorphGNT for years but nothing even remotely as slick as this.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Zack Hubert has taken my [MorphGNT] and built a &lt;a href=&#34;http://zhubert.com&#34;&gt;GNT Browser&lt;/a&gt; that blew me away!</summary>
  </entry><entry>
    <title type="html">MorphGNT v5.04 and Beyond</title>
    <link href="https://jktauber.com/2004/12/09/morphgnt-v504-and-beyond/" rel="alternate" type="text/html" title="MorphGNT v5.04 and Beyond"/>
    <published>2004-12-09</published>
    <updated>2004-12-09</updated>
    <id>https://jktauber.com/2004/12/09/morphgnt-v504-and-beyond</id>
    <content type="html" xml:base="https://jktauber.com/2004/12/09/morphgnt-v504-and-beyond/">&lt;p&gt;I&#39;ve released a new version of my [MorphGNT].&lt;/p&gt;
&lt;p&gt;Details of the changes are on the [MorphGNT] page but they all stem from a simple query performed via a Python script: in cases where there is no parse-code (i.e. the word is essentially uninflected), is the text form the same as the lexical form (other than accentuation)?&lt;/p&gt;
&lt;p&gt;In some cases this rule means that new lexical forms need to be provided to allow for spelling variation, rather than the lexical form normalising spelling. This is an editorial decision I&#39;ve made that makes more sense in the larger picture of where I&#39;m going with the MorphGNT.&lt;/p&gt;
&lt;p&gt;The corrections I&#39;m making to the CCAT database are really just a side-effect of my efforts to build an original database of New Testament Greek morphology. I&#39;ll say more about it as it develops but the idea is that surface forms, lexical forms, spelling variations, roots, stems, suppletion, morpho-phonological rules, etc. will all be catalogued with relationships between them expressed as a directed labelled graph.&lt;/p&gt;
&lt;p&gt;Eventually, the MorphGNT will reference into this graph rather than merely give the lemma. There&#39;ll be a partial ordering of nodes in the graph (expressed by a subset of arc types) and so references will be to the node that is as general as can explain the specific surface form.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">I&#39;ve released a new version of my [MorphGNT].</summary>
  </entry><entry>
    <title type="html">MorphGNT v5.03 available</title>
    <link href="https://jktauber.com/2004/12/07/morphgnt-v503-available/" rel="alternate" type="text/html" title="MorphGNT v5.03 available"/>
    <published>2004-12-07</published>
    <updated>2004-12-07</updated>
    <id>https://jktauber.com/2004/12/07/morphgnt-v503-available</id>
    <content type="html" xml:base="https://jktauber.com/2004/12/07/morphgnt-v503-available/">&lt;p&gt;More corrections now and more coming soon.&lt;/p&gt;
&lt;p&gt;Version 5.03 contains a major correction to the lemma PRO; a correction to MYRA; some spelling distinctions ENEKEN/ENEKA, BETHSAIDA(N), GOLGOTHA(N); and case corrections in proper names GERASENOS, STEFANOS, FOROS, TREIS, TABERNE, DIABLOS.&lt;/p&gt;
&lt;p&gt;See [MorphGNT].&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">More corrections now and more coming soon.</summary>
  </entry><entry>
    <title type="html">MorphGNT v5.02 Available</title>
    <link href="https://jktauber.com/2004/12/05/morphgnt-v502-available/" rel="alternate" type="text/html" title="MorphGNT v5.02 Available"/>
    <published>2004-12-05</published>
    <updated>2004-12-05</updated>
    <id>https://jktauber.com/2004/12/05/morphgnt-v502-available</id>
    <content type="html" xml:base="https://jktauber.com/2004/12/05/morphgnt-v502-available/">&lt;p&gt;Some breathing corrections on rho-initial words.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Some breathing corrections on rho-initial words.</summary>
  </entry><entry>
    <title type="html">Programmed Vocabulary Learning as a Travelling Salesman Problem</title>
    <link href="https://jktauber.com/2004/11/26/programmed-vocabulary-learning-travelling-salesman/" rel="alternate" type="text/html" title="Programmed Vocabulary Learning as a Travelling Salesman Problem"/>
    <published>2004-11-26</published>
    <updated>2004-11-26</updated>
    <id>https://jktauber.com/2004/11/26/programmed-vocabulary-learning-travelling-salesman</id>
    <content type="html" xml:base="https://jktauber.com/2004/11/26/programmed-vocabulary-learning-travelling-salesman/">&lt;p&gt;For a while I&#39;ve been interested in how you could select the order in which vocabulary is learnt in order to maximise one&#39;s ability to read a particular corpus of sentences. Or more generally, imagine you have a set of things you want to learn and each item has prerequisites drawn from a large set with items sharing a lot of common prerequisites.&lt;/p&gt;
&lt;p&gt;As an abstract example, imagine you want to be able to read the &#34;sentences&#34;:&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;{&amp;quot;a b&amp;quot;, &amp;quot;b a&amp;quot;, &amp;quot;h a b&amp;quot;, &amp;quot;d a b e c&amp;quot;, &amp;quot;d a g f&amp;quot;}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;where we assume you must first learn each &#34;word&#34;. Further assuming that all sentences are equally valuable to learn, how would you order the learning of words to maximise what you know at any given point in time?&lt;/p&gt;
&lt;p&gt;One approach would be to learn the prerequisites in order of their frequency. So you might learn in an order like&lt;/p&gt;
&lt;pre class=&#34;codehilite&#34;&gt;&lt;code&gt;&amp;lt;a, b, d, c, e, f, g, h&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;However, had we put h before d, we could have had an overall learning programme that, although equal in length by the end, enabled the learner, at the half-way mark, to understand three sentences instead of just two.&lt;/p&gt;
&lt;p&gt;To investigate this further, I needed a way to score a particular learning programme and decided that one reasonable way to do so would be to sum, across each step, the fraction of the overall set of sentences understandable at that point.&lt;/p&gt;
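&lt;p&gt;A minimal sketch of that scoring idea, applied to the contrived example above. This is not the original script; the function names are made up for illustration, and exact fractions are used only to keep the arithmetic easy to check.&lt;/p&gt;

```python
from fractions import Fraction

SENTENCES = ["a b", "b a", "h a b", "d a b e c", "d a g f"]


def readable_counts(order, sentences):
    """Number of sentences fully readable after each word in `order` is learnt."""
    needed = [set(s.split()) for s in sentences]
    known = set()
    counts = []
    for word in order:
        known.add(word)
        counts.append(sum(1 for words in needed if words.issubset(known)))
    return counts


def score(order, sentences):
    """Sum, across each step, of the fraction of sentences readable at that step."""
    return sum(Fraction(c, len(sentences)) for c in readable_counts(order, sentences))


frequency_order = list("abdcefgh")  # the frequency ordering from the text
h_first_order = list("abhdcefg")    # the same ordering with h moved before d

for order in (frequency_order, h_first_order):
    print("".join(order), readable_counts(order, SENTENCES), score(order, SENTENCES))
```

&lt;p&gt;Under these assumptions the frequency ordering scores 21/5 and the ordering with h before d scores 24/5, with two versus three sentences readable after the fourth word.&lt;/p&gt;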
&lt;p&gt;I then needed an algorithm that would find the ordering that would maximise this score.&lt;/p&gt;
&lt;p&gt;After the quick realisation that the number of possible learning programmes was factorial in the number of words, it dawned on me that this was essentially a travelling salesman problem.&lt;/p&gt;
&lt;p&gt;So my sister Jenni and I wrote a Python script that implements a simulated annealing approach to the TSP. We then applied it to the above contrived example. Sure enough, it found a solution that was better than a straight prerequisite frequency ordering.&lt;/p&gt;
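&lt;p&gt;The script itself isn&#39;t reproduced here, but a simulated annealing search over word orderings might look roughly like the sketch below. The neighbour move (swapping two words), the acceptance rule and all names are assumptions made for illustration. The default parameters borrow values mentioned in this post: an initial temperature of 1, an alpha of 0.9, 50 iterations per temperature and a final temperature of 0.001.&lt;/p&gt;

```python
import math
import random

SENTENCES = ["a b", "b a", "h a b", "d a b e c", "d a g f"]


def score(order, sentences):
    """Sum, across each step, of the fraction of sentences readable at that step."""
    needed = [set(s.split()) for s in sentences]
    known, total = set(), 0.0
    for word in order:
        known.add(word)
        total += sum(1 for words in needed if words.issubset(known)) / len(sentences)
    return total


def anneal(sentences, t_initial=1.0, t_final=0.001, alpha=0.9, iterations=50, seed=0):
    """Search for a high-scoring word ordering by simulated annealing."""
    rng = random.Random(seed)
    current = sorted({w for s in sentences for w in s.split()})
    rng.shuffle(current)
    current_score = score(current, sentences)
    best, best_score = list(current), current_score
    temperature = t_initial
    while temperature > t_final:
        for _ in range(iterations):
            # Assumed neighbour move: swap two randomly chosen positions.
            i, j = rng.sample(range(len(current)), 2)
            candidate = list(current)
            candidate[i], candidate[j] = candidate[j], candidate[i]
            candidate_score = score(candidate, sentences)
            delta = candidate_score - current_score
            # Always accept improvements; accept a worse ordering
            # with probability exp(delta / temperature).
            if delta >= 0 or math.exp(delta / temperature) > rng.random():
                current, current_score = candidate, candidate_score
                if current_score > best_score:
                    best, best_score = list(current), current_score
        temperature *= alpha  # geometric cooling schedule
    return best, best_score


best_order, best_score = anneal(SENTENCES)
print(" ".join(best_order), round(best_score, 2))
```

&lt;p&gt;Because the search is random, the ordering found depends on the seed and the parameters; the sketch keeps track of the best ordering seen so far so that a late bad move doesn&#39;t lose it.&lt;/p&gt;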
&lt;p&gt;I then decided to try applying it to a small extract of the Greek New Testament (which, of course, [I have in electronic form], already stemmed). So I ran it on the first chapter of John&#39;s Gospel. 198 words and 51 verses. A straight frequency ordering on this text achieves a score of 48 so that was the score to beat.&lt;/p&gt;
&lt;p&gt;My first attempt didn&#39;t even come close to that. What a disappointment! Jenni and I wondered if it was just the initial parameters to the annealing model. So we increased the number of iterations at a given temperature to 50 and lowered the final temperature to 0.001 (keeping the initial temperature at 1 and the alpha at 0.9).&lt;/p&gt;
&lt;p&gt;Success!! It found a solution that scored 82.94. The first verse readable (after 27 words) was John 1.34. John 1.20 was then readable after just 2 more words and John 1.4 after another 7.&lt;/p&gt;
&lt;p&gt;I decided to try different parameters. With 100 iterations per temp, a final temp of 0.0001 and a few hours, it achieved a score of 91.59 (and was still increasing at the time). This time the first verse readable was John 1.24, after only 8 words; then John 1.4 after another 9; John 1.10 after 4; and both John 1.1 and John 1.6 after another 4 and John 1.2 just 1 word after that.&lt;/p&gt;
&lt;p&gt;Overall a very promising approach. I doubt it&#39;s anything new but it was fun discovering the approach ourselves rather than just reading about it in some textbook. The example I tested it on was vocabulary learning, but it could apply to anything that can similarly be modelled as items to learn with prerequisites drawn from a large, shared set.&lt;/p&gt;
&lt;p&gt;The next step (besides more optimised code and even more long-running parameters) would be to try to work out how to model layered prerequisites — i.e. where prerequisites themselves have prerequisites — to any number of levels. I haven&#39;t thought yet how (or even whether) that boils down (no pun intended) to a simulated annealing solution to the TSP.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UPDATE (2005-08-03)&lt;/strong&gt;: Now see &lt;a href=&#34;/2005/08/03/using-simulated-annealing-order-goal-prerequisites/&#34;&gt;Using Simulated Annealing to Order Goal Prerequisites&lt;/a&gt;.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">For a while I&#39;ve been interested in how you could select the order in which vocabulary is learnt in order to maximise one&#39;s ability to read a particular corpus of sentences. Or more generally, imagine you have a set of things you want to learn and each item has prerequisites drawn from a large set with items sharing a lot of common prerequisites.</summary>
  </entry><entry>
    <title type="html">MorphGNT v5.01 Available</title>
    <link href="https://jktauber.com/2004/11/21/morphgnt-v501-available/" rel="alternate" type="text/html" title="MorphGNT v5.01 Available"/>
    <published>2004-11-21</published>
    <updated>2004-11-21</updated>
    <id>https://jktauber.com/2004/11/21/morphgnt-v501-available</id>
    <content type="html" xml:base="https://jktauber.com/2004/11/21/morphgnt-v501-available/">&lt;p&gt;Found an accent and breathing problem in both the text and lemma for ABEL, ANNA and ANNAS which is now corrected.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on jtauber.com&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">Found an accent and breathing problem in both the text and lemma for ABEL, ANNA and ANNAS which is now corrected.</summary>
  </entry><entry>
    <title type="html">MorphGNT v5.00 Available</title>
    <link href="https://jktauber.com/2004/11/14/morphgnt-v500-available/" rel="alternate" type="text/html" title="MorphGNT v5.00 Available"/>
    <published>2004-11-14</published>
    <updated>2004-11-14</updated>
    <id>https://jktauber.com/2004/11/14/morphgnt-v500-available</id>
    <content type="html" xml:base="https://jktauber.com/2004/11/14/morphgnt-v500-available/">&lt;p&gt;At wildly varying intensities over the last ten years, I’ve worked on correcting the UPenn CCAT Morphological Parsed Greek New Testament as a side-effect of larger linguistic analyses I’ve undertaken.&lt;/p&gt;
&lt;p&gt;The last big burst of activity was in 2002 when I resumed work on my own morphological analysis (starting with the nouns).&lt;/p&gt;
&lt;p&gt;The last couple of weekends, I’ve been working on preparing a new release of the corrected MorphGNT file, the first in probably seven or so years.&lt;/p&gt;
&lt;p&gt;Prompted by a post to the b-greek mailing list, I’ve now made that release. MorphGNT v5.00 is now available at [MorphGNT].&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on &lt;a href=&#34;https://jtauber.com/&#34;&gt;jtauber.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">At wildly varying intensities over the last ten years, I’ve worked on correcting the UPenn CCAT Morphological Parsed Greek New Testament as a side-effect of larger linguistic analyses I’ve undertaken.</summary>
  </entry><entry>
    <title type="html">The Bible and the Semantic Web</title>
    <link href="https://jktauber.com/2004/05/04/bible-and-semantic-web/" rel="alternate" type="text/html" title="The Bible and the Semantic Web"/>
    <published>2004-05-04</published>
    <updated>2004-05-04</updated>
    <id>https://jktauber.com/2004/05/04/bible-and-semantic-web</id>
    <content type="html" xml:base="https://jktauber.com/2004/05/04/bible-and-semantic-web/">&lt;p&gt;For many years I’ve been thinking about the application of Semantic Web technology to studying (and presenting the results of the study of) the Bible. However, I never really thought about the application of Bible study (and the tools and techniques developed for it) to the Semantic Web.&lt;/p&gt;
&lt;p&gt;Then I came across this &lt;a href=&#34;http://leobard.twoday.net/stories/209611/&#34;&gt;great blog entry&lt;/a&gt;, discussing the latter.&lt;/p&gt;
&lt;p&gt;On the former, there is a wonderful site &lt;a href=&#34;http://www.semanticbible.com/&#34;&gt;SemanticBible&lt;/a&gt; that I hope I can contribute to in some way.&lt;/p&gt;
&lt;p&gt;I also really need to get back to my morphological analysis. I haven’t thought about it for a while, but I need to come up with URIs for each lemma and word form. I could even grandfather in Strong’s numbers and G/K numbers.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;em&gt;originally published on &lt;a href=&#34;https://jtauber.com/&#34;&gt;jtauber.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;</content>
    <author>
      <name>James Tauber</name>
    </author>
    <summary type="html">For many years I’ve been thinking about the application of Semantic Web technology to studying (and presenting the results of the study of) the Bible. However, I never really thought about the application of Bible study (and the tools and techniques developed for it) to the Semantic Web.</summary>
  </entry></feed>