A few weeks ago, I announced the first release of
text-validator, my pluggable command-line tool for validating the formatting and orthography of text files.
Since then, I’ve made a couple of small updates.
In 0.2, I added a validator plugin to test tokens against a list of regular expressions. This is great for catching stray characters.
For example, here’s the configuration I use for my text of the Enchiridion:
TOKEN_REGEXES = [
    "\\d+\\.\\d+$",
    "[«(]*[\u0370-\u03FF\u1F00-\u1FFF]+\u2019?[.,:;»)]*$",
]
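To make the two patterns concrete, here is a minimal sketch of how a list like this could be applied to a line of text. It is not the plugin’s actual code: the `invalid_tokens` helper is hypothetical, and it assumes tokens are split on whitespace and matched from their start with `re.match` (the patterns end in `$` but carry no leading `^`).

```python
import re

TOKEN_REGEXES = [
    "\\d+\\.\\d+$",  # section references such as 1.1
    "[«(]*[\u0370-\u03FF\u1F00-\u1FFF]+\u2019?[.,:;»)]*$",  # Greek word, optional punctuation
]

COMPILED = [re.compile(pattern) for pattern in TOKEN_REGEXES]


def invalid_tokens(line):
    """Return the tokens in `line` that match none of the patterns.

    Hypothetical helper: assumes whitespace tokenisation and re.match,
    which anchors at the start of the token.
    """
    return [
        token
        for token in line.split()
        if not any(regex.match(token) for regex in COMPILED)
    ]


print(invalid_tokens("1.1 και abc"))  # prints ['abc']
```

Under these assumptions, a reference like `1.1` and a Greek word like `και` pass, while a stray Latin token such as `abc` is reported.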
In 0.3, I made a small but significant change: the tool now returns a non-zero status code if validation fails. This doesn’t make much difference if you’re just running the tool manually on the command line, but it is vital if you’re running it as part of a continuous integration (CI) process, where the exit status is what marks the build as passed or failed.
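The general pattern looks something like the sketch below. This is an illustration, not text-validator’s own code: the `run` function and the `no_trailing_whitespace` check are hypothetical stand-ins for a real validator.

```python
import sys


def no_trailing_whitespace(line):
    """Hypothetical check: the line has no whitespace before its newline."""
    return line.rstrip("\n") == line.rstrip()


def run(lines, is_valid):
    """Report each failing line on stderr and return a process exit status.

    Returns 0 when every line passes and 1 otherwise, which is the
    convention CI systems rely on to decide whether a step succeeded.
    """
    failures = 0
    for number, line in enumerate(lines, start=1):
        if not is_valid(line):
            print(f"line {number}: validation failed", file=sys.stderr)
            failures += 1
    return 1 if failures else 0
```

A script built this way would end with something like `sys.exit(run(sys.stdin, no_trailing_whitespace))`, so that a failed validation stops the CI job instead of passing silently.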
You can read more about
text-validator and how to use it at https://github.com/jtauber/text-validator and the linked-to wiki pages.