I thought I’d write a quick Python script to check how many accents were on each of the lemmata in MorphGNT 5.06.
Here are the counts, by part of speech and number of accents on the lemma:
| POS |       0 |       1 |   2 |
+-----+---------+---------+-----+
| A   |       - |    9159 |   - |
| C   |     924 |   17361 |   - |
| D   |    1592 |    4606 |   - |
| I   |       - |      17 |   - |
| N   |      30 |   28271 |   1 |
| P   |    5433 |    5488 |   - |
| RA  |   19862 |       4 |   - |
| RD  |       - |    1744 |   - |
| RI  |       - |    1165 |   - |
| RP  |       - |   11584 |   - |
| RR  |       - |    1677 |   - |
| V   |       8 |   28101 |   1 |
| X   |     147 |     844 |   - |
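A tally like the one in the table can be sketched in a few lines of Python. This is not the original script, just a minimal version. It assumes the lemmata are Unicode Greek and that (part of speech, lemma) pairs have already been pulled out of the MorphGNT files, whose layout is not assumed here. `count_accents`, `tally` and the sample rows are illustrative names and data.

```python
import unicodedata
from collections import Counter, defaultdict

# Combining marks treated as accents once the text is decomposed (NFD).
# Breathings, diaeresis and iota subscript are deliberately not counted.
ACCENTS = {
    "\u0300",  # combining grave (varia)
    "\u0301",  # combining acute (oxia / tonos)
    "\u0342",  # combining Greek perispomeni (circumflex)
}


def count_accents(lemma):
    """Return the number of accent marks on a Unicode Greek lemma."""
    decomposed = unicodedata.normalize("NFD", lemma)
    return sum(1 for ch in decomposed if ch in ACCENTS)


def tally(pairs):
    """Map part of speech -> Counter of accent counts, given (pos, lemma) pairs."""
    table = defaultdict(Counter)
    for pos, lemma in pairs:
        table[pos][count_accents(lemma)] += 1
    return table


# Hypothetical sample rows for illustration, not taken from the MorphGNT data.
sample = [
    ("N", "λόγος"),
    ("RA", "ὁ"),
    ("V", "λέγω"),
    ("P", "ἐν"),
    ("N", "δοῦλος"),
]
counts = tally(sample)

for pos in sorted(counts):
    print(pos, dict(counts[pos]))
```

Decomposing to NFD first means precomposed characters from both the Greek and Greek Extended blocks are handled the same way, since each reduces to a base letter plus combining marks.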
Some of the low numbers are definitely errors in the database. Now to investigate…
UPDATE (2005-07-16): Both 2-accent cases were mistakes. The 30 unaccented nouns and 5 of the 8 unaccented verbs are foreign loan words that were intentionally left unaccented, but the other 3 unaccented verbs were mistakes. The 4 accented articles are the result of crasis with the following noun, and each should probably be analyzed as a noun rather than an article. I guess there’ll be a 5.07 release soon. NOTE: I haven’t looked at the particles, adverbs, conjunctions or prepositions yet.
originally published on jtauber.com