i18n: Add translations #22

Open Jookia opened this issue on 1 Sep 2022 - 0 comments

@Jookia Jookia commented on 1 Sep 2022

Over the past month I've been working towards adding translations but
haven't fully written out the design or thought process, so here goes.

The requirements for NewLang's translation system are the following:

  • Stable translation keys to aid testing
  • Works on embedded devices
  • Has accessible tooling

The most obvious solution on Linux is to use GNU gettext. Translators are familiar with it, it uses the English text as translation keys, and it has tools like Weblate and Poedit. But it has some downsides:

The first is that translation keys are unstable. The English text itself is the key, so any change to the wording changes the key and forces fuzzy matching. This is a headache for testing, and storing full English strings as keys also takes up flash space.

The second is that the system for specifying plural forms is insane. ICU and the Unicode CLDR give each language a set of plural categories (zero, one, two, few, many, other) and read the rules from a locale database. Gettext instead puts a C expression in the message file like this and expects you to parse it:

Plural-Forms: nplurals=6; \
    plural=n==0 ? 0 : n==1 ? 1 : n==2 ? 2 : n%100>=3 && n%100<=10 ? 3 \
    : n%100>=11 ? 4 : 5;
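
To make the expression easier to read, here is a rough Python transcription of it. The function name `plural_index` and the pairing of the six indices with the category names listed earlier are assumptions made for illustration; the header itself only defines numeric indices.

```python
# Assumption: index i corresponds to the i-th category named in the text
# (zero, one, two, few, many, other). The header only defines the numbers.
CATEGORIES = ("zero", "one", "two", "few", "many", "other")


def plural_index(n: int) -> int:
    """Line-by-line transcription of the C ternary chain in the header."""
    if n == 0:
        return 0
    if n == 1:
        return 1
    if n == 2:
        return 2
    if 3 <= n % 100 <= 10:
        return 3
    if n % 100 >= 11:
        return 4
    return 5


def plural_category(n: int) -> str:
    """Map a count to a category name via the index (hypothetical helper)."""
    return CATEGORIES[plural_index(n)]
```

For example, `plural_category(103)` gives `"few"` because 103 % 100 is 3, while `plural_category(100)` falls through every test and gives `"other"`.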

I did a quick survey and while Gettext has a proper parser for this, TinyGettext strips all the spaces and uses the expression as a key in a lookup table to detect the language, then uses its own plural mapper. node-gettext just takes the Unicode CLDR table and uses that instead.
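
The lookup-table approach described above can be sketched roughly as follows. This is not TinyGettext's actual code: the names `normalise`, `KNOWN_RULES` and `lookup_plural_rule` are made up, and the two rules in the table are just the common two-form headers for "singular only when n is 1" and "plural only above 1".

```python
def normalise(header_value: str) -> str:
    """Drop all whitespace and line-continuation backslashes from the header."""
    return "".join(header_value.replace("\\", "").split())


def one_is_singular(n: int) -> int:
    # Hand-written equivalent of "plural=n != 1"
    return 0 if n == 1 else 1


def zero_and_one_are_singular(n: int) -> int:
    # Hand-written equivalent of "plural=n > 1"
    return 1 if n > 1 else 0


# The normalised expression is only a key; it is never parsed or evaluated.
KNOWN_RULES = {
    normalise("nplurals=2; plural=n != 1;"): one_is_singular,
    normalise("nplurals=2; plural=n > 1;"): zero_and_one_are_singular,
}


def lookup_plural_rule(header_value: str):
    """Return the built-in mapper for a known header, or None if unknown."""
    return KNOWN_RULES.get(normalise(header_value))
```

The trade-off is visible in the last function: a header written in a way the table doesn't anticipate simply isn't recognised, even if it means the same thing.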

Having a sublanguage for expressing arbitrary plural forms is useful, and the Unicode CLDR has one too. But I'd rather this be generated at build time instead of pushing the complexity onto the device.

The third downside is that the main tools used for editing Gettext translations (Poedit and Weblate for example) don't have an accessibility policy. So while they may work now, there's no guarantee that a fancy design makeover won't ruin this, and then it becomes my problem.

My proposal to start with is a simple translation system with no plurals or helper software. Each translation would be a single file with an INI-style mapping from key to translation. The correct translation would be picked using the LANG or LC_MESSAGES environment variable on Linux.
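
A minimal sketch of what this could look like, assuming a `key = translation` line format with `#` or `;` comments and a fallback to English. The file syntax details, the function names and the example key are all hypothetical; none of this is decided yet.

```python
def parse_translations(text: str) -> dict[str, str]:
    """Parse 'key = translation' lines into a dict (assumed file format)."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or comment
        key, sep, value = line.partition("=")
        if not sep:
            continue  # assumed policy: ignore lines without '='
        table[key.strip()] = value.strip()
    return table


def candidate_languages(environ) -> list[str]:
    """List language names to try, most specific first, ending with English."""
    # LC_MESSAGES is checked before LANG because it is the more specific setting.
    raw = environ.get("LC_MESSAGES") or environ.get("LANG") or ""
    # "de_DE.UTF-8@euro" -> "de_DE": drop the codeset and modifier
    name = raw.split(".")[0].split("@")[0]
    if name in ("", "C", "POSIX"):
        return ["en"]
    candidates = [name]
    base = name.split("_")[0]
    if base != name:
        candidates.append(base)
    if "en" not in candidates:
        candidates.append("en")  # assumed fallback language
    return candidates
```

In real use the mapping passed in would be `os.environ`. With `LANG=de_DE.UTF-8` the candidates are `de_DE`, then `de`, then `en`, and the first one with a translation file would win.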

For now, tooling for the format would likely just read the English translation and copy any new keys into whichever translation the translator is working on. This would give a decent enough experience for translators who want to work off of the English text.
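
As a sketch of how small that tool could be, assuming translations are already loaded as key-to-text dictionaries. The function names, the example keys and the choice to use the English text as the placeholder are illustrative assumptions.

```python
def sync_translation(english: dict[str, str], translation: dict[str, str]) -> dict[str, str]:
    """Return a copy of `translation` with missing keys filled in from English."""
    merged = dict(translation)
    for key, text in english.items():
        # Existing translations are kept; only absent keys get the English text.
        merged.setdefault(key, text)
    return merged


def render(table: dict[str, str]) -> str:
    """Write the table back out as 'key = translation' lines."""
    return "".join(f"{key} = {value}\n" for key, value in table.items())
```

Given an English table with `greeting` and `farewell` and a German table that only has `greeting`, the result keeps the German greeting and adds `farewell` with its English text, ready for the translator to replace.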
