I’ve released a first version of a pluggable command-line tool for validating the formatting and orthography of text files.
Various text projects like the apostolic-fathers have sometimes included little custom scripts I’ve written to validate the files. Is the Unicode normalised? Are there stray characters or bad line endings? Are references in a valid format?
I had also started including some Greek-specific tests in the greek-normalisation library.
But starting the greek-texts project, I decided it would be nice to have a generic framework for writing text file validators that could be used for all sorts of projects and files.
The result is text-validator. Think of it like a code linter but for your text files.
Each validator is its own Python module and, while a few basic tests are included in the library, the idea is that third parties can write their own validators and make them installable Python packages for others to use.
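To give a rough idea of the kind of check a validator module might contain, here is a standalone sketch that answers the "is the Unicode normalised?" question. It does not use text-validator's actual plugin interface; the function name check_nfc and the (line number, message) return shape are assumptions made purely for illustration.

```python
import unicodedata


def check_nfc(lines):
    """Yield (line_number, message) for each line that is not NFC-normalised.

    Hypothetical helper for illustration only; this is not the
    text-validator plugin API.
    """
    for number, line in enumerate(lines, start=1):
        # unicodedata.is_normalized is available from Python 3.8 onwards.
        if not unicodedata.is_normalized("NFC", line):
            yield number, "line is not NFC-normalised"


if __name__ == "__main__":
    # The second sample spells the accented letter as "e" plus a combining
    # acute accent, which is not in NFC form.
    sample = ["caf\u00e9", "cafe\u0301"]
    for number, message in check_nfc(sample):
        print(f"{number}: {message}")
```

A real plugin would also need whatever hooks the framework expects for configuration and reporting, which are not shown here.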
To use it, install the library:
pip install text-validator
along with any third-party plugins you want to use.
You then configure your validator plugins with a TOML file like:
["text_validator.plugins.whitespace"]
CHECK_CRLF = true
CHECK_TABS = true
CHECK_TRAILING_WHITESPACE = true
CHECK_NO_EOF_NEWLINE = true
and run the command validate-text to run your suite of configured plugins on the files in your text project.
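For a flavour of what whitespace checks like the four configured above involve, here is a minimal standalone sketch. It is not the bundled plugin's code: the function check_whitespace and its keyword arguments are hypothetical, named only to mirror the config keys.

```python
def check_whitespace(data: bytes,
                     check_crlf=True,
                     check_tabs=True,
                     check_trailing_whitespace=True,
                     check_no_eof_newline=True):
    """Return a list of (line_number, message) problems found in data.

    Hypothetical illustration; the keyword arguments mirror the config
    keys but this is not text-validator's own implementation.
    """
    problems = []
    lines = data.split(b"\n")
    # A file that ends with a newline splits into a final empty chunk.
    ends_with_newline = lines[-1] == b""
    if ends_with_newline:
        lines.pop()
    for number, line in enumerate(lines, start=1):
        if line.endswith(b"\r"):
            if check_crlf:
                problems.append((number, "CRLF line ending"))
            # Drop the carriage return so it is not also reported below.
            line = line[:-1]
        if check_tabs and b"\t" in line:
            problems.append((number, "tab character"))
        if check_trailing_whitespace and line != line.rstrip(b" \t"):
            problems.append((number, "trailing whitespace"))
    if check_no_eof_newline and data and not ends_with_newline:
        problems.append((len(lines), "no newline at end of file"))
    return problems


if __name__ == "__main__":
    for number, message in check_whitespace(b"ok\r\n\tx \nlast"):
        print(f"{number}: {message}")
```

Working on bytes rather than decoded text keeps the line-ending check honest, since opening a file in text mode would normally translate CRLF away before the check could see it.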
Create issues in the GitHub repository if there are particular validators you'd like to see or would like to contribute.
I haven’t tried it yet but I’d like to try hooking text-validator up as a test that gets run on commits and pull requests on GitHub as part of a CI process.