-
October 21, 2020 / James Tauber
I started working on some Plato texts a while ago but now I'm back to it, integrating various information and hitting some more issues with the Diorisis corpus.
-
October 4, 2020 / James Tauber
Jesse Egbert's Plenary at JAECS 2020 is giving me a bunch of ideas of things to try on the New Testament and larger Greek corpora. In this post, I briefly explore text dispersion keyness using pericopes as a way of ordering vocabulary.
-
September 14, 2020 / James Tauber
In the previous post, we looked at lemma and token coverage in the works of Plato assuming knowledge of Greek New Testament vocabulary. Here we graphically look at those results and make an important observation.
-
September 2, 2020 / James Tauber
Seumas Macdonald asked me about vocabulary coverage for each work of Plato assuming one has learnt the New Testament vocabulary.
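As a rough illustration of what such coverage figures measure, here is a minimal Python sketch. It assumes each text is available as a list of lemmas, one per token. The function name `coverage` and the placeholder data are hypothetical and not taken from the post.

```python
def coverage(target_lemmas, known_lemmas):
    """Return (lemma_coverage, token_coverage) for a lemmatised target text.

    lemma coverage: share of the distinct lemmas in the target that are known
    token coverage: share of the running tokens whose lemma is known
    """
    known = set(known_lemmas)
    tokens = list(target_lemmas)
    if not tokens:
        raise ValueError("target text has no tokens")
    distinct = set(tokens)
    lemma_cov = len(distinct & known) / len(distinct)
    token_cov = sum(1 for lemma in tokens if lemma in known) / len(tokens)
    return lemma_cov, token_cov


# Placeholder lemmas standing in for real data (an assumption for illustration).
known_vocab = ["a", "b"]
target_text = ["a", "a", "b", "c"]
lemma_cov, token_cov = coverage(target_text, known_vocab)
```

Token coverage is usually the higher of the two numbers because frequent lemmas account for most of the running text.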
-
August 30, 2020 / James Tauber
Part forty-eight of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
August 19, 2020 / James Tauber
Part forty-seven of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
August 15, 2020 / James Tauber
Part forty-six of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
August 1, 2020 / James Tauber
Last week, we launched greektyping.com to help people get better at typing Greek. Aurélien Berra asked what the method of choosing words to type was so I thought I'd write a blog post about it.
-
July 28, 2020 / James Tauber
Part forty-five of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
July 26, 2020 / James Tauber
I've revived an old web application to help people practice typing Ancient Greek.
-
July 6, 2020 / James Tauber
Part forty-four of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
June 29, 2020 / James Tauber
Part forty-three of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
June 15, 2020 / James Tauber
As I slowly expand my plans for a Morphological Lexicon of New Testament Greek to a Morphological Lexicon of Ancient Greek, I'm dealing with extra challenges in lemmatization.
-
May 10, 2020 / James Tauber
Part forty-two of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
March 19, 2020 / James Tauber
I have made a minor update to greek-normalisation, a more significant update to vocabulary-tools, and have started a new project postag-convert for converting between various morphosyntactic tagging schemes.
-
March 17, 2020 / James Tauber
Part forty-one of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
February 24, 2020 / James Tauber
Via an unusual route, I discovered Edward Adolf Sonnenschein and his thoughts at the turn of the 20th century on teaching Latin (and Greek).
-
February 13, 2020 / James Tauber
Last year I started the Python library vocabulary-tools to consolidate the various scripts I've written over the years to analyse vocabulary in (particularly New Testament) texts. I've just added support for the vocabulary in Vanessa Gorman's treebanks.
-
February 4, 2020 / James Tauber
Part forty of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
January 20, 2020 / James Tauber
I've recently started working on cleaning up the Diorisis Ancient Greek Corpus for my own vocabulary and morphology work as well as potential use in Scaife.
-
January 5, 2020 / James Tauber
Part thirty-nine of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
January 3, 2020 / James Tauber
Part thirty-eight of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
January 2, 2020 / James Tauber
Part thirty-seven of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
December 31, 2019 / James Tauber
Part thirty-six of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
December 30, 2019 / James Tauber
Part thirty-five of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
December 29, 2019 / James Tauber
Part thirty-four of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
December 28, 2019 / James Tauber
Part thirty-three of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
December 27, 2019 / James Tauber
Part thirty-two of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
December 26, 2019 / James Tauber
Part thirty-one of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
December 12, 2019 / James Tauber
Part thirty of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
December 3, 2019 / James Tauber
A few weeks ago, I announced the first release of text-validator, my pluggable command-line tool for validating the formatting and orthography of text files.
-
November 29, 2019 / James Tauber
Part twenty-nine of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
November 21, 2019 / James Tauber
Mounce’s Basics of Biblical Greek Grammar is a very popular modern textbook, with over 400,000 copies sold and now in its fourth edition. There’s a lot one could quibble with around the usual suspects of deponency, aspect, or the general grammar-translation approach but it’s particularly odd when basic (and, as far as I know, uncontroversial) terminology is misused or misunderstood. I’m talking in particular about the way “ablaut” is discussed.
-
November 15, 2019 / James Tauber
This year I've been thinking about (and working on) the representation of lexical information quite a bit.
-
November 11, 2019 / James Tauber
I've released a first version of a pluggable command-line tool for validating the formatting and orthography of text files.
-
November 7, 2019 / James Tauber
Today I'm heading off to Los Angeles to attend the Thirty-First Annual UCLA Indo-European Conference.
-
November 5, 2019 / James Tauber
Long-time readers of this blog know that, along with morphology, a core research area of mine is vocabulary. Prompted by Seumas Macdonald and now as part of the Greek Texts Project, I started putting together some vocabulary coverage statistics for various subcorpora of Greek prose.
-
November 3, 2019 / James Tauber
Following on from my success with it on the Digital Tolkien Project website, I decided to switch to using Jekyll for the generation of jktauber.com as a static site.
-
November 2, 2019 / James Tauber
A twitter conversation led to the creation of a new project to work on annotated Greek texts for language learners.
-
November 1, 2019 / James Tauber
In the last couple of weeks I've done a couple of minor releases of the greek-normalisation Python library which brings together various code I use to clean up Greek texts and normalise the forms.
-
July 6, 2019 / James Tauber
Here are the conferences I'm attending (and in some cases, presenting at) in June through August. I probably should have posted this at the start of my conference travel, but here it is.
-
July 6, 2019 / James Tauber
For years I’ve had Python code for normalising Greek forms, checking for stray characters, etc. I finally got around to consolidating them in a library.
-
April 30, 2019 / James Tauber
Part twenty-eight of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
April 20, 2019 / James Tauber
One of my goals for 2019 is to bring more structure to various disparate Greek projects and, as part of that, I’ve started consolidating multiple one-off projects I’ve done around vocabulary coverage statistics and ordering experiments.
-
February 1, 2019 / James Tauber
Exactly three months ago to the day, I announced that Seumas Macdonald and I were working on a corrected, open, digital edition of the Apostolic Fathers based on Lake. That initial work is now complete.
-
January 14, 2019 / James Tauber
In Five Types of Morphological Analysis I outlined five distinct ways of approaching morphological (or potentially any linguistic) analysis. In support of some of these, I have some additional examples from a pair of papers I'm reading and a conference I just attended.
-
December 10, 2018 / James Tauber
People talking about morphological analyses can often speak across each other because they have different purposes in mind. Here's an initial attempt to outline five possibly distinct notions one might be referring to.
-
November 1, 2018 / James Tauber
I'm working with Seumas Macdonald on an open, corrected digital edition of the Apostolic Fathers based on Lake.
-
October 18, 2018 / James Tauber
Part twenty-seven of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
September 8, 2018 / James Tauber
Part twenty-six of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
September 6, 2018 / James Tauber
Last week I attended the ninth International Colloquium on Ancient Greek Linguistics at the University of Helsinki.
-
August 25, 2018 / James Tauber
Part twenty-five of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
July 29, 2018 / James Tauber
Part twenty-four of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
July 23, 2018 / James Tauber
Eliran Wong asked for a more detailed description of the “normalisation” column in MorphGNT so I promised him I’d write a blog post about it.
-
May 26, 2018 / James Tauber
Part twenty-three of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
May 16, 2018 / James Tauber
Part twenty-two of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
April 25, 2018 / James Tauber
John Lee’s Basics of Greek Accents was released today. Here are some first impressions.
-
March 18, 2018 / James Tauber
I’m off for another string of conferences, this time in Copenhagen, Chicago, and New Orleans.
-
March 10, 2018 / James Tauber
Part twenty-one of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
March 5, 2018 / James Tauber
Part twenty of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
February 3, 2018 / James Tauber
I’ve finally done the work in translating the MorphGNT tagging system to a new proposal for initial feedback.
-
January 21, 2018 / James Tauber
Measures of dispersion are interesting to apply to a corpus because they tell you whether a word is distributed across parts of the corpus as expected or concentrated more in just some parts. I thought I’d play around with Gries’s DP as a measure of dispersion on the SBLGNT lemmas.
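For readers unfamiliar with the measure, here is a minimal sketch of how DP can be computed for a single word. It assumes the corpus has already been split into parts and that the word's count in each part and each part's size in tokens are known. The function name `gries_dp` and the example numbers are illustrative assumptions, not the code used in the post.

```python
def gries_dp(part_counts, part_sizes):
    """Gries's DP: half the summed absolute difference between the observed
    share of a word's occurrences in each corpus part and the share expected
    from that part's size. Values near 0 mean an even spread across parts;
    values approaching 1 mean the word is concentrated in a small part of
    the corpus."""
    if len(part_counts) != len(part_sizes):
        raise ValueError("need one count and one size per corpus part")
    total_count = sum(part_counts)
    total_size = sum(part_sizes)
    if total_count == 0 or total_size == 0:
        raise ValueError("word and corpus must both be non-empty")
    return 0.5 * sum(
        abs(count / total_count - size / total_size)
        for count, size in zip(part_counts, part_sizes)
    )


# Two equally sized parts (made-up numbers for illustration).
even = gries_dp([5, 5], [100, 100])      # word spread evenly
skewed = gries_dp([10, 0], [100, 100])   # word found in one part only
```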
-
December 24, 2017 / James Tauber
I thought I’d help a friend learn the Unix command line (the basics, though pretty comprehensive for this type of work) with some practical graded exercises using MorphGNT. It worked out well, so I thought I’d share them in case they are useful to others.
-
November 22, 2017 / James Tauber
I’ve put my two SBL papers this year (from both the recent Annual Meeting and the International Meeting) online and also sync’d my Annual Meeting slides to audio I recorded on my iPhone.
-
November 18, 2017 / James Tauber
I’m again speaking at the SBL Annual Meeting, this time in Boston. My topic is basically the “lemma lattice” work started by Ulrik Sandborg-Petersen and me back in 2006 but which I’ve never presented in this sort of setting before.
-
November 3, 2017 / James Tauber
In his talk on adversative conjunction in Gothic at the 29th UCLA Indo-European Conference, Jared Klein started with a wonderful example paragraph in English.
-
November 2, 2017 / James Tauber
Part nineteen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
November 1, 2017 / James Tauber
Tomorrow I’m off to Los Angeles for the Twenty-Ninth Annual UCLA Indo-European Conference.
-
October 27, 2017 / James Tauber
Part eighteen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
October 16, 2017 / James Tauber
Part seventeen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
September 25, 2017 / James Tauber
pyuca is my pure-Python implementation of the Unicode Collation Algorithm—a library I use almost every day to properly sort Greek (although the library is not Greek-specific). I was recently asked how to use pyuca with a more recent DUCET than 6.3.0. That led to me needing to make a number of changes to the core code so it now supports 8.0.0, 9.0.0 and 10.0.0 as long as you have the right Python version.
-
September 7, 2017 / James Tauber
Part sixteen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
September 5, 2017 / James Tauber
Part fifteen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
September 2, 2017 / James Tauber
With a boost in numbers on vocab.oxlos.org, this post looks at some slightly more detailed statistics from the first activity.
-
August 29, 2017 / James Tauber
Part fourteen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
August 29, 2017 / James Tauber
Here are some very preliminary statistics from the Greek Vocab site’s first month.
-
August 27, 2017 / James Tauber
I recently saw a nice visualisation of English letter bigram frequencies and decided to replicate it with Greek New Testament data.
-
August 26, 2017 / James Tauber
Part thirteen of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
August 16, 2017 / James Tauber
Part twelve of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
August 5, 2017 / James Tauber
This afternoon I'm heading off to Berlin for my first Society of Biblical Literature International Meeting, where I'll be speaking on adaptive reading environments for Biblical Greek.
-
August 5, 2017 / James Tauber
Last week I launched a site for Greek vocabulary. Here's how the first week has gone.
-
August 3, 2017 / James Tauber
Part eleven of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
August 2, 2017 / James Tauber
Part ten of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
July 29, 2017 / James Tauber
I was thinking about vocabulary differences between books of the New Testament and decided to see what happens when you do a hierarchical clustering analysis of NT books using the Jaccard distance of their lemma sets.
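The distance step of such an analysis can be sketched as follows. This is a minimal sketch, assuming each book is represented as a set of lemmas. The helper names and placeholder data are hypothetical, and the clustering itself (for example with a library such as SciPy) is left out.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 minus the size of the intersection over the size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here: two empty sets are identical
    return 1.0 - len(a & b) / len(union)


def pairwise_distances(lemma_sets):
    """Map each unordered pair of book names to their Jaccard distance."""
    return {
        (x, y): jaccard_distance(lemma_sets[x], lemma_sets[y])
        for x, y in combinations(sorted(lemma_sets), 2)
    }


# Placeholder lemma sets standing in for real books (illustration only).
books = {
    "book1": {"a", "b"},
    "book2": {"b", "c"},
    "book3": {"a", "b"},
}
distances = pairwise_distances(books)
```

A distance of 0 means two books share exactly the same lemma set, and 1 means they share no lemmas at all.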
-
July 29, 2017 / James Tauber
I've put together a new little site to host various activities to research vocabulary knowledge and acquisition in the context of Ancient and Biblical Greek.
-
July 23, 2017 / James Tauber
Part nine of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
July 17, 2017 / James Tauber
Part eight of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
July 16, 2017 / James Tauber
I’ve thought for a while that “A man walks into a bar” jokes are a great example of how definiteness works in English. I mentioned this to Jonathan Robie in Cambridge and he seemed to like the example too so I thought I’d share it more broadly.
-
July 14, 2017 / James Tauber
Part seven of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
July 11, 2017 / James Tauber
Part six of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
July 10, 2017 / James Tauber
I sometimes get people expressing an interest in my Greek reader work or get asked about the status of my "reader" and I have to ask them to clarify which reader they mean. I thought I might do a quick post where I spell out various "reader" projects I have worked on and am working on.
-
July 6, 2017 / James Tauber
Part five of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
July 2, 2017 / James Tauber
Part four of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
June 30, 2017 / James Tauber
Jonathan Robie's Treedown format is a really nice way of conveying basic syntactic structure in real texts. I recently experimented a little with some code for collapsing and expanding of the structure.
-
June 29, 2017 / James Tauber
Part three of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
June 25, 2017 / James Tauber
Part two of a tour through Greek inflectional morphology to help get students thinking more systematically about the word forms they see (and maybe teach a bit of general linguistics along the way).
-
June 25, 2017 / James Tauber
I was here last month but I'm back again for a series of conferences and then my graduation.
-
June 23, 2017 / James Tauber
This is the first post in a (likely long) series exploring the inflectional morphology of Greek. My goal is to work through various aspects of Greek morphology to help students think more systematically about the subject.
-
May 31, 2017 / James Tauber
While most of my focus has been on inflectional morphology, I've done a little bit of work on modelling derivational morphology and it's been a desideratum for my reader and learning algorithm work dating back to at least the original 2008 "New Kind of Graded Reader" presentations.
-
May 24, 2017 / James Tauber
An analysis I did of a couple of chapters of Herodotus looks like it might be an interesting example to use for various treebanking approaches—both in terms of how things are structured as well as how they are visualised.
-
May 3, 2017 / James Tauber
Next week I'm headed to Germany for a whirlwind trip to Göttingen, Heidelberg, and Leipzig to share and discuss ideas with other scholars.
-
April 21, 2017 / James Tauber
On my now page, I currently list "finalising an improved set of morphology tags to use" under Medium Term. As I find myself sometimes having to clarify the motivation for and state of this, I thought I'd share what I just wrote in the Biblical Humanities Slack.
-
April 18, 2017 / James Tauber
In a recent post, Update on LXX Progress, I talked about the possibility of putting together a crowd-sourcing tool to help share the load of clarifying some parse code errors in the CATSS LXX morphological analysis. Last Friday, Patrick Altman and I spent an evening of hacking and built the tool.
-
April 17, 2017 / James Tauber
The last couple of weeks, I've been working on getting my greek-inflexion code working on Ulrik Sandborg-Petersen's analysis of the Nestle 1904. The first pass of this is now done.
-
April 10, 2017 / James Tauber
As mentioned in previous posts, I've been working through the LXX, initially making sure my greek-inflexion library can generate the same analysis of verbs as the CATSS LXX Morphology and adding to the verb stem database accordingly. This is a preliminary to being able to run the code on alternative LXX editions such as Swete and provide a freely available morphologically-tagged LXX.
-
February 15, 2017 / James Tauber
Over the last few weeks, I've made a number of new releases of the MorphGNT SBLGNT analysis fixing some accentuation issues mostly in the normalization column. This came out of ongoing work on modelling accentuation (and, in particular, rules around clitics).
-
January 2, 2017 / James Tauber
In greek-inflexion and an Update on the Morphological Lexicon I said that all the verbs in the MorphGNT SBLGNT analysis should be done by the end of the year. I hit that goal and made a decent start on the Septuagint.
-
December 4, 2016 / James Tauber
Back in Polytonic Greek Unicode Still Isn’t Perfect and An Updated Solution to Polytonic Greek Unicode’s Problems I talked about problems with stacking vowel length and other diacritics. At least in terms of the font used on this site, the problems are now solved.
-
December 2, 2016 / James Tauber
Exactly seven months ago, I released a generic library, inflexion, and said I'd soon follow it up with the Greek-specific stuff. While I did open-source the latter on GitHub as greek-inflexion shortly thereafter, I didn't want to announce it here until it was further along. I'm happy to say it now is.
-
November 26, 2016 / James Tauber
I've put together slides and a voice-over to further explain Greek accent placement from a moraic point-of-view.
-
November 26, 2016 / James Tauber
Three weeks ago I fixed a few bugs in greek-accentuation and ended up doing three releases (although I only blogged about two at the time). I've now done a fourth bug fix release: 1.0.4.
-
November 7, 2016 / James Tauber
Cleaning up code as part of another bug fix to greek-accentuation led me to update an old diagram I'd done showing the Greek accentuation possibilities in terms of morae.
-
November 4, 2016 / James Tauber
Hot on the heels of the 1.0.1 bug fix, I've released 1.0.2 with another fix, this time in the persistent accent placement. So I thought I'd explain how persistent accent placement is implemented and what the bug was.
-
November 3, 2016 / James Tauber
A minor bug fix release that fixes a problem with add_necessary_breathing.
-
September 11, 2016 / James Tauber
Occasionally I get into conversations about the Greek middle (or voice in general) but I've never written down my thoughts on the topic. Here's an attempt to summarize my current thinking, although there's nothing particularly novel about it.
-
July 27, 2016 / James Tauber
greek-accentuation has finally hit 1.0.0 with a couple more functions and a module layout change.
-
July 24, 2016 / James Tauber
This is part 7 of a series of blog posts about modelling stems and principal part lists and looks in even more detail at the format of the principal parts list in the DCC verbs.
-
July 16, 2016 / James Tauber
This is part 6 of a series of blog posts about modelling stems and principal part lists and looks more precisely at the format of the principal parts list in the DCC verbs.
-
June 26, 2016 / James Tauber
This is part 5 of a series of blog posts about modelling stems and principal part lists and covers the format of the principal parts themselves in the Pratt, Morwood and DCC verb lists.
-
June 22, 2016 / James Tauber
This is part 4 of a series of blog posts about modelling stems and principal part lists and covers the Dickinson College Commentaries (DCC) Greek Core lemmas and issues in merging them with the existing merge of Pratt and Morwood.
-
June 21, 2016 / James Tauber
This is part 3 of a series of blog posts about modelling stems and principal part lists and covers the Morwood lemmas and issues in merging them with Pratt's.
-
June 18, 2016 / James Tauber
This is part 1 of a series of blog posts about modelling stems and principal part lists and covers the three sources of Attic Greek principal parts used to expand and test the Morphological Lexicon.
-
June 18, 2016 / James Tauber
This is part 2 of a series of blog posts about modelling stems and principal part lists and covers the complexities in the notion of a lemma identifying lexical entries, specifically in the Pratt principal parts.
-
June 17, 2016 / James Tauber
This is part 0 of a series of blog posts about modelling stems and principal part lists, particularly for Attic Greek but hopefully more generally applicable. This is largely writing up work already done but I’m doing cleanup as I go along as well.
-
May 19, 2016 / James Tauber
A research career requires publication in peer-reviewed journals but what if some of your scholarly output is in the form of software? The Journal of Open Source Software attempts to solve that by essentially wrapping peer-reviewed software packages up as lightweight papers. My pyuca library was just accepted for publication by the journal.
-
May 4, 2016 / James Tauber
In my post Morphological Parts of Speech in Greek last year, I presented a model of five or six parts of speech based purely on what they inflect for. I just found out Varro suggested similar for Latin over two thousand years ago.
-
May 1, 2016 / James Tauber
Over the last few years, I've worked on a number of iterations of code that can generate Ancient Greek verb forms. I've now broken out the Greek-specific pieces and released a generic library called inflexion.
-
February 19, 2016 / James Tauber
I'm currently in Vienna for the International Morphology Meeting.
-
February 9, 2016 / James Tauber
In Polytonic Greek Unicode Still Isn’t Perfect, I enumerated various challenges that still exist with using Polytonic Greek when vowel length needs to be marked. I now have a better appreciation of what solutions are actually realistic.
-
January 28, 2016 / James Tauber
Whether we're talking about fonts, programming languages, keyboard entry or even the command-line, support for polytonic Greek has greatly improved even in the last 10 years, let alone the 23 years I've been doing computational analysis of Greek texts.
-
January 18, 2016 / James Tauber
While I write and release a lot of Python code for working with Ancient Greek, it tends to be either throwaway code for data wrangling or fairly specialized code for things like accentuation or inflectional morphology.
-
January 17, 2016 / James Tauber
As part of my explicit annotation of the normalization column in MorphGNT, I started down the rabbit hole of capitalization conventions which led to an interesting experiment with direct speech and the GBI syntax trees.
-
January 16, 2016 / James Tauber
The latest release of MorphGNT (with a corresponding release of the Python library py-sblgnt) fixes some lemmatization issues along with a couple of accent and part-of-speech changes.
-
January 13, 2016 / James Tauber
I recently found out about François Gouin, a sort of proto-Charles Berlitz who wrote (in French) a book called The art of teaching and studying languages, published in 1880 and then translated and published in English in 1892.
-
January 6, 2016 / James Tauber
I'm heading off to the LSA's annual meeting for the first time.
-
December 15, 2015 / James Tauber
Ten years ago, when Ulrik Sandborg-Petersen and I started collaborating, we came up with a way of referencing lexemes that would satisfy both the lumpers and splitters. At the time we wrote a paper that we circulated to a small audience but now it's finally up on Academia.edu.
-
December 15, 2015 / James Tauber
Often it's useful to see whether certain columns in a table can be entirely determined by others. For example, can you unambiguously get the lemma from just the form (the answer is no, so a more useful question is which forms are ambiguous as to lemma)? Does knowing the part-of-speech help? Here we provide some code and give some examples.
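The kind of check described can be sketched in a few lines of Python. This is not the code from the post. It is a minimal sketch that assumes the table is a list of dictionaries, with hypothetical column names (`form`, `pos`, `lemma`) and placeholder values.

```python
from collections import defaultdict


def ambiguous_keys(rows, key_cols, value_col):
    """Return the key tuples that map to more than one distinct value.

    An empty result means value_col is fully determined by key_cols
    in this table.
    """
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[col] for col in key_cols)
        seen[key].add(row[value_col])
    return {key: values for key, values in seen.items() if len(values) > 1}


# Placeholder rows (illustration only): form "x" occurs with two lemmas.
rows = [
    {"form": "x", "pos": "N", "lemma": "A"},
    {"form": "x", "pos": "V", "lemma": "B"},
    {"form": "y", "pos": "N", "lemma": "C"},
]
by_form = ambiguous_keys(rows, ["form"], "lemma")
by_form_and_pos = ambiguous_keys(rows, ["form", "pos"], "lemma")
```

In this toy table the form alone does not determine the lemma, but the form together with the part-of-speech does.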
-
November 27, 2015 / James Tauber
Since the Series-6 release, MorphGNT has had a column that normalizes the word forms in the text for contextual things like accent changes, elision, movable nu and capitalization. I thought it would be useful to provide an annotation of exactly what normalization had been done for each word in the text and why.
-
November 23, 2015 / James Tauber
Well, I did it! I blogged a post for every day in the four weeks leading up to my talk at SBL. It was a fantastic motivator but I can't sustain the pace.
-
November 22, 2015 / James Tauber
This morning I gave my talk at SBL 2015 on my Morphological Lexicon project.
-
November 21, 2015 / James Tauber
In anticipation of my SBL talk tomorrow, here's an update on my verbal analysis.
-
November 20, 2015 / James Tauber
I knew that a necessary component of a comprehensive morphological analyzer for Ancient Greek was going to be a library for handling accentuation, so back in January 2014, I started the greek-accentuation Python library.
-
November 19, 2015 / James Tauber
What is the genitive singular ending for 2nd declension nouns?
-
November 18, 2015 / James Tauber
Back in July and August 2014, I started looking at patterns in the full citation forms of nouns in Danker's Concise Lexicon. My goal was partly to explore, in a systematic way, the relationship between inflectional classes and the information expressed in the common pattern of {nominative form}, {genitive ending}, {article}. I also wanted to put together a kind of automated test to catch typos and inconsistencies in the lexicon.
-
November 17, 2015 / James Tauber
Text-to-speech is pretty good these days but a lot of people don't realize that operating systems like OS X have support for languages other than English, including Modern Greek. So I thought I'd experiment with using it to read the Greek New Testament.
-
November 16, 2015 / James Tauber
Back in The Core Vocabulary of New Testament Greek I talked about Wilfred Major's 2008 paper on core vocabulary lists for Classical Greek and provided code for producing the same for the Greek New Testament along with some discussion of the results. I didn't actually include the full results, however.
-
November 15, 2015 / James Tauber
Over in the lab section of this site, I've added a little prototype Patrick Altman and I built last night.
-
November 14, 2015 / James Tauber
In Analyzing Nominal Morphology: Part 1, I talked about putting together a list of nominal distinguishers and verifying it on the MorphGNT, generating a per-lexeme theme + distinguisher analysis. Here, I'll outline some further steps I've taken.
-
November 13, 2015 / James Tauber
Over the years, when generating vocab coverage stats or orderings for graded readers, I've used either lemmas or inflected forms as the items being learnt.
-
November 12, 2015 / James Tauber
While much of my work going back 10 years or more was on the nominals, the last few years I've been focused on verbal morphology. I decided that for my SBL paper, however, I'd revisit some of my noun work and ended up exploring some ideas afresh.
-
November 11, 2015 / James Tauber
In my previous post, I talked about the legal / licensing aspects of open linguistic data, but there are also technical requirements that linguistic data must meet to be truly open.
-
November 10, 2015 / James Tauber
I don't think I've ever articulated why I favour a Creative Commons CC-BY-SA license on all my New Testament Greek data.
-
November 9, 2015 / James Tauber
Adding another potential readability metric, let's look at the mean log frequency of dependency paths.
-
November 8, 2015 / James Tauber
Exactly two weeks ago I said I'd be blogging every day until my talk at SBL. Well, that's two weeks away so I'm at the halfway point. I think the blogging has gone well.
-
November 7, 2015 / James Tauber
Back in April 2014, Brian Renshaw posted a Good Friday Greek Reader. It was presumably manually produced but I knew such things could be generated automatically and so went about building a system to do so.
-
November 6, 2015 / James Tauber
In many Greek morphology projects, I've wanted a way of conveying the surface form of an inflected word while also conveying the underlying components prior to the application of the sandhi rule. A couple of years ago, I came up with a simple representation for inline annotation.
-
November 5, 2015 / James Tauber
The parts of speech in a particular language can be drawn up on the basis of syntactic properties, morphological properties, and/or (perhaps most problematically) semantic properties.
-
November 4, 2015 / James Tauber
In a previous post, we looked at which chapters had the highest mean log frequency of lexemes. The code provided there was applicable to other items, though, so let's now take a look at mean log frequency of forms.
-
November 3, 2015 / James Tauber
A few years ago, I was introduced by Greg Stump to the notion of distinguishers in morphological description. The analysis of inflected forms in terms of theme + distinguisher is a very helpful concept and one that I make extensive use of in my ongoing work on New Testament Greek morphology.
-
November 2, 2015 / James Tauber
Release 1.1 of GitHub's Atom Editor fixes a problem I had with using it for polytonic Greek.
-
November 1, 2015 / James Tauber
I think it's confusing that we name the non-indicative tense-forms with the same terms as indicative tense-forms. For example “present indicative” and “present infinitive”. The word “present” doesn't mean the same thing in both cases.
-
October 31, 2015 / James Tauber
Back in July, I thought I'd prototype a REST API for MorphGNT with resources for books, paragraphs, sentences, verses and words.
-
October 30, 2015 / James Tauber
In a 2008 paper, Wilfred Major constructs what he calls the 50% and 80% vocab lists for Classical Greek. That is, the lemmata that account for 50% and 80% respectively of tokens in the Classical Greek corpus. In this post I provide the code for the equivalent for the Greek New Testament and talk about some of the results.
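A minimal sketch of how such a list might be computed, assuming the corpus is available as a flat list of lemmas, one per token. The function name `coverage_list` and the toy tokens are hypothetical; this is not the code from the post.

```python
from collections import Counter


def coverage_list(lemmas, threshold):
    """Return the most frequent lemmas that together cover at least
    `threshold` (a fraction between 0 and 1) of all tokens."""
    counts = Counter(lemmas)
    total = sum(counts.values())
    selected = []
    covered = 0
    for lemma, count in counts.most_common():
        if covered >= threshold * total:
            break
        selected.append(lemma)
        covered += count
    return selected


# Made-up placeholder tokens (hypothetical data): 6 x "a", 3 x "b", 1 x "c".
tokens = ["a"] * 6 + ["b"] * 3 + ["c"]

fifty_percent_list = coverage_list(tokens, 0.5)
eighty_percent_list = coverage_list(tokens, 0.8)
```

With real data, `lemmas` would hold one lemma per token of the corpus, in any order.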
-
October 29, 2015 / James Tauber
With dependency paths calculated for the Greek New Testament, we can use mean dependency depth as a proxy for syntactic complexity.
-
October 28, 2015 / James Tauber
For numerous corpus linguistics applications, it's useful to have a word-level indication of syntax. A presentation by Vanessa and Robert Gorman gave me the idea of using dependency paths for this purpose so I've now calculated them for the GNT based on the GBI syntax trees.
-
October 27, 2015 / James Tauber
One component of many readability measures on texts is the mean log word frequency. Here I do a basic calculation across chapters in the Greek New Testament (with code provided).
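One way the calculation might look is sketched below. It assumes chapters are given as a mapping from a chapter identifier to a list of items (lemmas or forms), and it takes the natural log of raw corpus counts; the choice of log base and of raw versus relative frequency is an assumption here, and `mean_log_frequency` is a hypothetical name, not the post's own code.

```python
import math
from collections import Counter


def mean_log_frequency(chapters):
    """Map each chapter id to the mean log corpus frequency of its items.

    `chapters` maps a chapter id to a list of items (e.g. lemmas).
    Frequencies are counted across all chapters together.
    """
    counts = Counter(item for items in chapters.values() for item in items)
    return {
        chapter: sum(math.log(counts[item]) for item in items) / len(items)
        for chapter, items in chapters.items()
        if items  # skip empty chapters to avoid dividing by zero
    }


# Made-up placeholder chapters (hypothetical data).
chapters = {"1": ["a", "a", "b"], "2": ["b", "c"]}
scores = mean_log_frequency(chapters)
```

A higher score means a chapter is made up of items that are more frequent in the corpus as a whole.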
-
October 26, 2015 / James Tauber
In various mailing list posts, blog posts and talks, I've shown vocabulary coverage statistics. It's time to update the code to use more recent data and republish the results here.
-
October 25, 2015 / James Tauber
It's exactly four weeks until I'm presenting at the SBL Annual Meeting in Atlanta. As I have a long backlog of posts I've wanted to do for a while, I thought I might try to blog every day between now and my talk on November 22nd.
-
July 15, 2015 / James Tauber
I've just finished up registration for the SBL Annual Meeting. Here's the paper I'll be presenting.
-
July 13, 2015 / James Tauber
As helpful as the GBI Syntax Trees are, I have disagreements with them. Randall and Andi are receptive to feedback but there are very different types of disagreement that can arise in syntactic analysis so I thought I'd start to note down what they are.
-
July 2, 2015 / James Tauber
With one child on each branch identified as the head, a constituent analysis can be converted to a dependency analysis. Fortunately, the GBI syntax trees have an explicit indication of the head, so I went ahead and converted them to a dependency format.
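The conversion can be sketched roughly as follows. The tree representation is an assumption made for illustration (a leaf is a word string; a branch is a dict with a `children` list and the index of its `head` child) and is not the GBI format. The English toy tree and the function names are likewise hypothetical.

```python
def lexical_head(node, deps):
    """Return the head word of `node`, appending (dependent, head) pairs to `deps`."""
    if isinstance(node, str):
        return node  # a leaf is its own head
    heads = [lexical_head(child, deps) for child in node["children"]]
    head = heads[node["head"]]
    # Every non-head child's head word depends on the head child's head word.
    for index, child_head in enumerate(heads):
        if index != node["head"]:
            deps.append((child_head, head))
    return head


def to_dependencies(tree):
    """Convert a head-marked constituent tree to (root, list of dependencies)."""
    deps = []
    root = lexical_head(tree, deps)
    return root, deps


# Toy tree (hypothetical): [[the dog*] barks*], with * marking the head child.
tree = {
    "head": 1,
    "children": [{"head": 1, "children": ["the", "dog"]}, "barks"],
}
root, deps = to_dependencies(tree)
```

Real data would use token identifiers rather than bare word strings, since the same word can occur more than once in a sentence.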
-
May 13, 2015 / James Tauber
Thanks to Chris Beaven, Paul McLanahan and Michal Čihař, Python 2 support is back in pyuca 1.1.
-
May 6, 2015 / James Tauber
BibleTech talks were not recorded but I turned on my iPhone's Voice Memo recording and later sync'd the audio with my slides to make this video.
-
February 1, 2014 / James Tauber
pyuca is my pure Python implementation of the Unicode Collation Algorithm (for sorting, amongst other things, Greek).
-
January 18, 2011 / James Tauber
The last three months, I've been working on rebasing the MorphGNT database off the SBLGNT text rather than the UBS3.
-
April 25, 2010 / James Tauber
A post to the graded-reader mailing list from April 25, 2010.
-
April 14, 2010 / James Tauber
A post to the graded-reader mailing list from April 14, 2010.
-
April 12, 2010 / James Tauber
A post to the graded-reader mailing list from April 12, 2010.
-
March 28, 2010 / James Tauber
Yesterday I gave a talk on the graded reader ideas at BibleTech.
-
April 1, 2008 / James Tauber
A post to the graded-reader mailing list from April 1, 2008.
-
March 29, 2008 / James Tauber
A post to the graded-reader mailing list from March 29, 2008.
-
March 29, 2008 / James Tauber
A post to the graded-reader mailing list from March 29, 2008.
-
March 26, 2008 / James Tauber
A post to the graded-reader mailing list from March 26, 2008.
-
March 26, 2008 / James Tauber
A post to the graded-reader mailing list from March 26, 2008.
-
March 25, 2008 / James Tauber
A post to the graded-reader mailing list from March 25, 2008.
-
March 23, 2008 / James Tauber
A post to the graded-reader mailing list from March 23, 2008.
-
March 23, 2008 / James Tauber
A post to the graded-reader mailing list from March 23, 2008.
-
March 23, 2008 / James Tauber
A post to the graded-reader mailing list from March 23, 2008.
-
March 22, 2008 / James Tauber
Owing to the amount of interest I received about A New Kind of Graded Reader...
-
February 10, 2008 / James Tauber
Back in 2004, I talked about algorithms for optimal vocabulary ordering.
-
January 14, 2008 / James Tauber
I don't think I've mentioned it here before but next week, I'm one of the keynote speakers at the BibleTech 2008 conference in Seattle.
-
November 4, 2007 / James Tauber
It is fairly common, in the context of learning vocabulary for a particular corpus like the Greek New Testament, to talk about what proportion of the text one could read if one learnt the top N words.
-
March 12, 2006 / James Tauber
I've hinted before about Ulrik Petersen and I collaborating on Greek New Testament linguistic endeavours.
-
February 13, 2006 / James Tauber
See Python Unicode Collation Algorithm for background.
-
January 28, 2006 / James Tauber
After the continuation of a permathread on the b-greek mailing list about the pros and cons of interlinears, I built some quick demonstrations of how CSS and Javascript could be used for dynamic interlinear glosses that would not be possible on the printed page.
-
January 27, 2006 / James Tauber
My preliminary attempt at a Python implementation of the Unicode Collation Algorithm (UCA) is done and available at:
-
January 1, 2006 / James Tauber
Some of you will be aware of Ulrik Petersen's work on augmenting Tischendorf's 8th edition with morphological tags and lemmata, based on work by Clint Yale and Maurice Robinson. Ulrik is also the developer of Emdros, an open-source text database engine for annotated text.
-
November 7, 2005 / James Tauber
I'm pleased to announce the release of a new version of MorphGNT, the morphologically parsed Greek New Testament database made available under a Creative Commons license.
-
August 31, 2005 / James Tauber
I'm pleased to announce the release of a new version of MorphGNT, the morphologically parsed Greek New Testament database made available under a Creative Commons license.
-
August 30, 2005 / James Tauber
I'm just about to release MorphGNT 5.07 and, shortly after that, a major new release I'll designate 6.07.
-
August 3, 2005 / James Tauber
Back in November, I wrote about programmed vocabulary learning as a travelling salesman problem.
-
August 3, 2005 / James Tauber
The outcome of my simulated annealing program is a list of prerequisites to learn along with an indication, every so often, of what new goal has been reached.
-
July 16, 2005 / James Tauber
I thought I'd write a quick Python script to check how many accents were on each of the lemmata in MorphGNT 5.06.
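A check along these lines could be sketched as below, assuming the lemmata are Unicode strings. It decomposes each word with NFD and counts the combining acute, grave, and perispomeni marks; `count_accents` is a hypothetical name and this is not the original script.

```python
import unicodedata

# Combining acute, grave, and Greek perispomeni (circumflex).
ACCENTS = {"\u0301", "\u0300", "\u0342"}


def count_accents(word):
    """Count accent marks in a polytonic Greek word.

    NFD normalization splits precomposed letters into a base letter
    plus combining marks, so breathings and accents can be told apart.
    """
    decomposed = unicodedata.normalize("NFD", word)
    return sum(1 for char in decomposed if char in ACCENTS)


def accent_histogram(lemmata):
    """Map each accent count to the list of lemmata having that count."""
    histogram = {}
    for lemma in lemmata:
        histogram.setdefault(count_accents(lemma), []).append(lemma)
    return histogram
```

Any lemma landing in a bucket other than one accent would then be worth inspecting by hand.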
-
July 16, 2005 / James Tauber
Well, it's been about a hundred hours work over the last six months, but I'm pleased to announce the release of a new version of MorphGNT, the morphologically parsed Greek New Testament database made available under a Creative Commons license.
-
July 4, 2005 / James Tauber
This month I should be doing another release of my morphologically-parsed Greek New Testament. This will be release 5.06. I thought I'd outline my future plans (as they currently stand).
-
June 10, 2005 / James Tauber
A couple of months ago, I talked about the current process I'm going through to identify errors in my morphologically parsed Greek New Testament, MorphGNT. By the end of April, I was down to 400 mismatches I needed to check. At the time, I thought I'd be able to finish going through them by the time I left to go to Europe on holiday.
-
April 19, 2005 / James Tauber
I previously talked about wanting to implement the lexicon language DATR in Python. Well, I just received an email from Henrik Weber saying that (apparently inspired by my post) he has gone and done an implementation at http://pydatr.sourceforge.net/
-
April 19, 2005 / James Tauber
For the last few months, I've been making corrections to MorphGNT by attempting to merge an English translation (NASB) marked with Strong's numbers with my database. Although it's a tedious process, it's revealing numerous errors.
-
January 27, 2005 / James Tauber
BetaCode is a common ASCII transcription for Polytonic Greek. I've been dealing with it for around twelve years. (As an aside, back in 1994, I designed a METAFONT for Polytonic Greek that enabled one to use BetaCode in TeX—I typeset my self-published Index to the Greek New Testament with it).
-
January 19, 2005 / James Tauber
I've been revisiting DATR, the lexical knowledge representation language, as a possible format for the next generation of MorphGNT. I was previously considering developing my own RDF/graph-based format but I suddenly remembered DATR from my student days and it makes a lot more sense to use it rather than try to build my own.
-
December 14, 2004 / James Tauber
Zack Hubert mentions that I'm thinking about using the NET Bible for a collaborative parallel glossing project.
-
December 14, 2004 / James Tauber
Various corrections.
-
December 14, 2004 / James Tauber
Zack Hubert has taken my MorphGNT and built a GNT Browser that blew me away!
-
December 9, 2004 / James Tauber
I've released a new version of my MorphGNT.
-
December 7, 2004 / James Tauber
More corrections now and more coming soon.
-
December 5, 2004 / James Tauber
Some breathing corrections on rho-initial words.
-
November 26, 2004 / James Tauber
For a while I've been interested in how you could select the order in which vocabulary is learnt in order to maximise one's ability to read a particular corpus of sentences. Or more generally, imagine you have a set of things you want to learn and each item has prerequisites drawn from a large set with items sharing a lot of common prerequisites.
-
November 21, 2004 / James Tauber
Found an accent and breathing problem in both the text and lemma for ABEL, ANNA and ANNAS which is now corrected.
-
November 14, 2004 / James Tauber
At wildly varying intensities over the last ten years, I’ve worked on correcting the UPenn CCAT Morphological Parsed Greek New Testament as a side-effect of larger linguistic analyses I’ve undertaken.
-
May 4, 2004 / James Tauber
For many years I’ve been thinking about the application of Semantic Web technology to studying (and presenting the results of the study of) the Bible. However, I never really thought about the application of Bible study (and the tools and techniques developed for it) to the Semantic Web.