A Tour of Greek Morphology: Part 45

Part forty-five of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).

We've classified aorist active endings into three classes:

alphathematic (first aorists including kappa, sigmatic, and pseudo-sigmatic)
thematic (second aorists)
root

It's important to stress that this is a classification of distinguisher paradigms. It is related to, but distinct from, other ways of classifying aorists based on the properties of the stem and how it relates to the imperfective (present) stem. We'll get to those other ways in a few posts' time but for now, our classification is based just on the distinctive set of endings.

As we've done before, we'll now take this classification and look at various counts in the SBLGNT. How many times do we encounter tokens of each class? How many different lemmas are in each class? Which paradigm cells are most common for each class?

Let's start with just the lemma and token counts as well as the number of lemmas that only occur once in the SBLGNT text.

class	# lemmas	# tokens	# hapakes
alphathematic	661	2973	326
thematic	103	2082	33
root	36	262	15

There is more lexical variety in the alphathematic class, especially when compared with the thematic class. This can be seen in the token-lemma ratio and in the percentage of lemmas that are hapakes.

class	token-lemma ratio	% hapakes
alphathematic	4.50	49.3 %
thematic	20.21	32.0 %
root	7.28	41.7 %
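For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes both columns from the counts in the first table. The names `counts`, `summarize` and `summary` are just illustrative; they don't come from any particular library.

```python
# Counts copied from the first table: (lemmas, tokens, hapakes) per class.
counts = {
    "alphathematic": (661, 2973, 326),
    "thematic": (103, 2082, 33),
    "root": (36, 262, 15),
}


def summarize(lemmas, tokens, hapakes):
    """Return (token-lemma ratio, percentage of lemmas that occur only once)."""
    return tokens / lemmas, 100 * hapakes / lemmas


summary = {name: summarize(*row) for name, row in counts.items()}

# Print one tab-separated row per class, rounded the same way as the table.
for name, (ratio, pct) in summary.items():
    print(f"{name}\t{ratio:.2f}\t{pct:.1f} %")
```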

Another way to see this is to ask what percentage of tokens are forms of the most frequent 5%, 10%, 25%, and 50% of lemmas in each class.

	5%	10%	25%	50%
alphathematic	44.1%	57.4%	76.1%	88.7%
thematic	60.7%	76.8%	89.6%	96.4%
root	21.8%	47.7%	80.5%	92.0%

This table is saying that the top 5% of lemmas with alphathematic forms make up 44.1% of alphathematic tokens but the top 5% of lemmas with thematic forms make up 60.7% of thematic tokens.
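A rough sketch of how a figure like this could be computed from per-lemma token counts is below. The function name `top_share` and the toy counts are made up for illustration (they are not SBLGNT data), and rounding the size of the "top" group up to a whole lemma is an assumption; the table above may have been built with a different rounding rule.

```python
import math


def top_share(lemma_counts, fraction):
    """Percentage of tokens covered by the most frequent `fraction` of lemmas.

    Assumption: the number of lemmas in the "top" group is rounded up,
    so at least one lemma is always included.
    """
    ranked = sorted(lemma_counts.values(), reverse=True)
    k = max(1, math.ceil(fraction * len(ranked)))
    return 100 * sum(ranked[:k]) / sum(ranked)


# Made-up token counts for ten placeholder lemmas (100 tokens in total).
toy = {f"lemma{i}": n for i, n in enumerate([50, 20, 10, 5, 5, 4, 3, 1, 1, 1])}

for fraction in (0.10, 0.25, 0.50):
    print(f"top {fraction:.0%} of lemmas: {top_share(toy, fraction):.1f}% of tokens")
```

With real data, `lemma_counts` would map each lemma in a class to its number of aorist active tokens.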

In other words, the thematic aorist active tokens are drawn from a smaller set of lemmas than the alphathematic. In fact, a third of thematic aorist active tokens in SBLGNT are forms of εἶπον (and, as we'll see in a moment, mostly 3SG).

One interesting anomaly perhaps worth coming back to at some stage (I wasn't aware of it until now) is that at the top 5% and top 10% lemma levels, the root aorist token percentage is lower than the alphathematic, but at the 25% and 50% levels it is higher.

Okay, that's distribution across the three classes of ending. What about individual paradigm cell counts?

	alphathematic	thematic	root
INF	509	351	95
1SG	224	163	12
2SG	88	30	3
3SG	1244	1143	94
1PL	80	45	3
2PL	119	40	13
3PL	709	310	42

In all cases, the infinitive and third person dominate.

It is interesting that in the alphathematics, 3SG dominates with 3PL next and then INF. In the thematics, 3SG dominates even more, followed by INF with 3PL not far behind. In the root aorists, the INF is actually level with the 3SG, with 3PL a distant third. Recall that the μι verbs have a root form in the INF but nowhere else. This likely explains why the INF makes up such a large proportion of root form tokens.

Within the 1st and 2nd person cells, the 1SG dominates in the alphathematic and especially the thematic. In the root, the 2PL is actually on par with the 1SG.
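To make these comparisons a little more concrete, here is a small sketch that turns the cell counts from the table above into within-class percentages and ranks the cells. The names `cell_counts`, `cell_shares` and `ranking` are illustrative only.

```python
cells = ["INF", "1SG", "2SG", "3SG", "1PL", "2PL", "3PL"]

# Token counts per paradigm cell, copied from the table above.
cell_counts = {
    "alphathematic": [509, 224, 88, 1244, 80, 119, 709],
    "thematic": [351, 163, 30, 1143, 45, 40, 310],
    "root": [95, 12, 3, 94, 3, 13, 42],
}


def cell_shares(row):
    """Map each cell to its percentage of the class's tokens."""
    total = sum(row)
    return {cell: 100 * n / total for cell, n in zip(cells, row)}


shares = {name: cell_shares(row) for name, row in cell_counts.items()}


def ranking(name):
    """Cells of one class, ordered from most to least frequent."""
    return sorted(cells, key=lambda cell: shares[name][cell], reverse=True)


for name in cell_counts:
    top = ", ".join(f"{cell} {shares[name][cell]:.1f}%" for cell in ranking(name)[:3])
    print(f"{name}: {top}")
```

Seen as percentages, the root INF and 3SG each account for roughly 36% of root tokens, whereas the INF is only about 17% of both the alphathematic and the thematic tokens.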

Again this is worthy of closer inspection but there are definitely individual lexical items at work here.

As we've done before, let's look at which lemmas (if any) dominate particular paradigm cell counts.

	thematic	root
INF		δοῦναι 33/95
1SG	εἶδον 54/163	ἔγνων 6/12 ἀνέβην 3/12
2SG	εἶδες 8/30	ἔγνως 3/3
3SG	εἶπε(ν) 610/1143
1PL		ἐνέβημεν 1/3 ἐπέγνωμεν 1/3 ἐξέστημεν 1/3
2PL	ἐλάβετε 13/40	ἀνέγνωτε 10/13
3PL		ἔγνωσαν 17/42

Consistent with its greater lexical variety, the alphathematic cells are not dominated by any one lexical item at all.

In the thematics, though, we see the disproportionate occurrence of εἶδον in the 1SG and 2SG and especially of εἶπον in the 3SG where it makes up more than half the occurrences of thematic 3SG aorist actives.

Note that no root aorist lemma dominates the 3SG cell but all the other cells have a small set of lemmas covering a lot of occurrences. ἔγνως is the only root 2SG form in the SBLGNT, and ἀνέγνωτε makes up 77% of root 2PL occurrences.
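The shares quoted here are just the fractions from the table expressed as percentages. A minimal sketch, keeping only the single most frequent form listed for each cell (so ἀνέβην and the three-way tie in the root 1PL are left out); `top_forms` and `share` are illustrative names:

```python
# (class, cell) -> (most frequent form, its token count, total tokens in the cell),
# copied from the table above.
top_forms = {
    ("thematic", "1SG"): ("εἶδον", 54, 163),
    ("thematic", "2SG"): ("εἶδες", 8, 30),
    ("thematic", "3SG"): ("εἶπε(ν)", 610, 1143),
    ("thematic", "2PL"): ("ἐλάβετε", 13, 40),
    ("root", "INF"): ("δοῦναι", 33, 95),
    ("root", "1SG"): ("ἔγνων", 6, 12),
    ("root", "2SG"): ("ἔγνως", 3, 3),
    ("root", "2PL"): ("ἀνέγνωτε", 10, 13),
    ("root", "3PL"): ("ἔγνωσαν", 17, 42),
}


def share(key):
    """Percentage of a cell's tokens accounted for by its most frequent form."""
    _form, count, total = top_forms[key]
    return 100 * count / total


for (cls, cell), (form, count, total) in top_forms.items():
    print(f"{cls} {cell}\t{form}\t{count}/{total}\t{share((cls, cell)):.1f}%")
```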

One thing that might be slightly misleading about the lemma numbers for the thematic and (especially) root aorists is the inclusion of compound verbs with preverbs. The 103 thematic aorist active lemmas actually come from 27 base verbs (there are 16 lexical items just from ἔρχομαι/ἦλθον, for example). The 36 root aorist active lemmas actually come from just 7 base verbs, and 3 of those (δίδωμι, τίθημι, and ἀφ-ίημι) only have a root ending in the infinitive.

So the only fully root verbs in the SBLGNT are the γνω family, the βη family, the στη family, and δυ. With the exception of δυ which has only one instance, the rest have reasonable token counts (82 for γνω, 71 for βη, 55 for στη).

The thematic aorist base verbs with the highest token counts are: the εἰπ family (689), the ἐλθ family (538), εἰδ (178), the λαβ family (125), the ἀγαγ family (71), the βαλ family (70), the ἀπο-θαν family (67), εὑρ (58), the πεσ family (57).

Next up we'll look at the aorist middles again.