pyuca is my pure Python implementation of the Unicode Collation Algorithm (for sorting, amongst other things, Greek).
I’ve just released version 1.0 for Python 3.3 and above, and it passes 100% of the UCA conformance tests.
I implemented enough back in 2006 to be able to sort Ancient Greek and released it on PyPI in 2012.
Since then, with input from others, I’ve made various improvements, but in October last year I decided to start testing against the comprehensive UCA conformance tests provided by the Unicode Consortium. Over the last couple of days I’ve had an intense sprint in which I got 100% of the tests passing and also reached 100% code coverage.
I also decided to ditch Python 2 support as part of my effort to encourage people to move to Python 3.
The repo is available at https://github.com/jtauber/pyuca/ but you can most easily get pyuca with
pip install pyuca
and then use it as follows:
from pyuca import Collator
c = Collator("allkeys.txt")
sorted_words = sorted(words, key=c.sort_key)
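To illustrate why a sort key is needed at all, here is a minimal sketch that uses only the standard library. The function crude_sort_key is a hypothetical stand-in I made up for this illustration: it strips accents and breathings and folds case, which is not the UCA and not how pyuca computes its keys. It only shows how plain code point order misplaces polytonic Greek and how passing a key function to sorted() changes the result.

```python
import unicodedata


def crude_sort_key(word):
    """Hypothetical stand-in for a collation key. This is NOT the UCA."""
    # Decompose so accents and breathings become separate combining marks.
    decomposed = unicodedata.normalize("NFD", word)
    # Rough "primary" level: base letters only, case folded.
    primary = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    ).casefold()
    # Fall back to the decomposed form to break ties deterministically.
    return (primary, decomposed)


# Normalise to NFC so the code point comparison below is well defined.
words = [
    unicodedata.normalize("NFC", w)
    for w in ["ἄνθρωπος", "λόγος", "θεός", "ἀγάπη"]
]

# Plain sorted() compares code points, so words starting with precomposed
# polytonic letters (U+1F00 and up) land after every unaccented letter.
by_code_point = sorted(words)

# With a key function, alpha-initial words come first, as a reader expects.
by_crude_key = sorted(words, key=crude_sort_key)

print(by_code_point)
print(by_crude_key)
```

With pyuca the pattern is the same, except that c.sort_key replaces the stand-in and implements the real algorithm.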
UPDATE (2015-05-13): Python 2 support is back in 1.1