Posts tagged vocabulary
Ordering Vocabulary by Pericope Dispersion
Jesse Egbert’s plenary at JAECS 2020 is giving me a bunch of ideas for things to try on the New Testament and larger Greek corpora. In this post, I briefly explore text dispersion keyness using pericopes as a way of ordering vocabulary.
More on Plato After GNT
In the previous post, we looked at lemma and token coverage in the works of Plato assuming knowledge of Greek New Testament vocabulary. Here we graphically look at those results and make an important observation.
Plato Vocabulary Coverage After the New Testament
Seumas Macdonald asked me about vocabulary coverage for each work of Plato assuming one has learnt the New Testament vocabulary.
Vanessa Gorman's Lemmatisation Now in vocabulary-tools
Last year I started the Python library vocabulary-tools to consolidate the various scripts I’ve written over the years to analyse vocabulary in (particularly New Testament) texts. I’ve just added support for the vocabulary in Vanessa Gorman’s treebanks.
Subcorpus Vocabulary Statistics
Long-time readers of this blog know that, along with morphology, a core research area of mine is vocabulary. Prompted by Seumas Macdonald and now as part of the Greek Texts Project, I started putting together some vocabulary coverage statistics for various subcorpora of Greek prose.
Consolidating Vocabulary Coverage and Ordering Tools
One of my goals for 2019 is to bring more structure to various disparate Greek projects and, as part of that, I’ve started consolidating multiple one-off projects I’ve done around vocabulary coverage statistics and ordering experiments.
Lexical Dispersion in the Greek New Testament Via Gries’s DP
Measures of dispersion are interesting to apply to a corpus because they tell you whether a word is distributed across parts of the corpus as expected or concentrated more in just some parts. I thought I’d play around with Gries’s DP as a measure of dispersion on the SBLGNT lemmas.
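As a rough illustration of the idea, Gries's DP can be sketched as half the sum, over corpus parts, of the absolute difference between the share of a word's occurrences that fall in a part and the share of the whole corpus that part makes up. The function name `gries_dp` and the list-of-lists input are assumptions made for this sketch; it is not the code from the post.

```python
def gries_dp(parts, target):
    """Gries's DP for `target` (hypothetical helper, not the post's code).

    `parts` is a list of corpus parts, each a list of lemmas.
    Returns 0.0 when the word is spread exactly in proportion to part
    sizes; values approach 1.0 as it concentrates in a small part.
    """
    corpus_size = sum(len(part) for part in parts)
    target_total = sum(part.count(target) for part in parts)
    if corpus_size == 0 or target_total == 0:
        raise ValueError("corpus is empty or target never occurs")

    total_difference = 0.0
    for part in parts:
        expected = len(part) / corpus_size           # share of the corpus in this part
        observed = part.count(target) / target_total  # share of the word's occurrences here
        total_difference += abs(observed - expected)
    return total_difference / 2


# Toy example with made-up data: two equally sized parts.
even = [["λόγος", "καί"], ["λόγος", "καί"]]
skewed = [["λόγος", "λόγος"], ["καί", "καί"]]
print(gries_dp(even, "λόγος"))    # evenly dispersed
print(gries_dp(skewed, "λόγος"))  # concentrated in one part
```

With real data, each part might be a book or chapter of the SBLGNT, and every lemma would be scored in turn.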
More Vocabulary Statistics
With a boost in numbers on vocab.oxlos.org, this post looks at some slightly more detailed statistics from the first activity.
Some Initial Vocabulary Statistics
Here are some very preliminary statistics from the Greek Vocab site’s first month.
First Week of New Vocab Site
Last week I launched a site for Greek vocabulary. Here’s how the first week has gone.
New Site for Vocabulary Experiments
I’ve put together a new little site to host various activities to research vocabulary knowledge and acquisition in the context of Ancient and Biblical Greek.
Actual Core Vocab Lists for Greek New Testament
Back in The Core Vocabulary of New Testament Greek I talked about Wilfred Major’s 2008 paper on core vocabulary lists for Classical Greek and provided code for producing the same for the Greek New Testament along with some discussion of the results. I didn’t actually include the full results, however.
The Core Vocabulary of New Testament Greek
In a 2008 paper, Wilfred Major constructs what he calls the 50% and 80% vocab lists for Classical Greek. That is, the lemmata that account for 50% and 80% respectively of tokens in the Classical Greek corpus. In this post I provide the code for the equivalent for the Greek New Testament and talk about some of the results.
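One way to picture such a list: count every lemma, walk down the counts from most to least frequent, and stop once the running total reaches the chosen share of tokens. The sketch below assumes the corpus is already a flat list of lemmas, and the name `coverage_list` is made up for illustration; it is not necessarily how Major's lists or the post's code are built.

```python
from collections import Counter


def coverage_list(lemmas, threshold):
    """Return the most frequent lemmas that together cover at least
    `threshold` (a fraction such as 0.5 or 0.8) of all tokens.

    Hypothetical helper for illustration only.
    """
    counts = Counter(lemmas)
    total = sum(counts.values())
    selected = []
    covered = 0
    for lemma, count in counts.most_common():
        if covered >= threshold * total:
            break  # target share already reached
        selected.append(lemma)
        covered += count
    return selected


# Toy corpus of ten tokens (made-up data, not real GNT counts).
toy = ["ὁ"] * 5 + ["καί"] * 2 + ["λόγος"] * 2 + ["ἐν"]
print(coverage_list(toy, 0.5))
print(coverage_list(toy, 0.8))
```

Ties in frequency are broken by first appearance in the input, so a real list would probably want an explicit secondary sort.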
Updated Vocabulary Coverage Statistics
In various mailing list posts, blog posts and talks, I’ve shown vocabulary coverage statistics. It’s time to update the code to use more recent data and republish the results here.
My BibleTech 2010 Talk
Yesterday I gave a talk on the graded reader ideas at BibleTech.
Vocab Coverage Table for a Better Ordering
A post to the graded-reader mailing list from March 29, 2008.
GNT Verse Coverage with Frequency Ordering
A post to the graded-reader mailing list from March 25, 2008.
GNT Verse Coverage Statistics
It is fairly common, in the context of learning vocabulary for a particular corpus like the Greek New Testament, to talk about what proportion of the text one could read if one learnt the top N words.
Programmed Vocabulary Learning as a Travelling Salesman Problem
For a while I’ve been interested in how you could select the order in which vocabulary is learnt in order to maximise one’s ability to read a particular corpus of sentences. Or more generally, imagine you have a set of things you want to learn and each item has prerequisites drawn from a large set with items sharing a lot of common prerequisites.