Translations
Translations are a complex topic and can be handled in several ways.
The content of applications and libraries can be translated either through a custom mechanism (e.g. a translation framework) or through the system's built-in translation library, i18n.
Translation sets
Translation sets are a way to specify how to translate a set of content.
They use an extremely compact and efficient format to minimize disk usage and maximize performance. Where the two goals conflict, some trade-offs favor speed at the expense of file size.
They are not meant to be modified directly. If a translation set needs to be edited by hand, it should instead be derived from another text format.
Header
The header contains the following information:
- Number of translatable languages (2 bytes)
- For each language:
  - Language's code in ASCII (3 bytes, using the ISO 639-2 format)
  - Location of the language's translation table in the file (8 bytes)
- Code of the reference language (3 bytes)
- Number of translation models (8 bytes)
- For each translation model:
  - The translation model itself
The reference language is the one the translations are made from. For instance, a translation set may be produced by translating English models to French. When the set is bundled, however, both English and French are written to their own translation tables, so the process cannot easily be reversed. The reference language records which language was used to create the translations. It can also be used in debug messages if something goes wrong.
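As an illustration, the fixed part of the header (everything before the translation models) could be read as in the minimal sketch below. The byte order is not stated in this document, so little-endian is an assumption here, exposed through the hypothetical `byte_order` parameter. The names `LanguageEntry` and `parse_header_prefix` are made up for this example.

```python
import struct
from dataclasses import dataclass


@dataclass
class LanguageEntry:
    code: str          # 3-letter ISO 639-2 code, e.g. "eng"
    table_offset: int  # location of the language's translation table in the file


def parse_header_prefix(data: bytes, byte_order: str = "<"):
    """Parse the header up to, but not including, the translation models.

    ASSUMPTION: integers are little-endian ("<"); pass ">" for big-endian.
    Returns (languages, reference_code, model_count, next_offset).
    """
    pos = 0
    (lang_count,) = struct.unpack_from(byte_order + "H", data, pos)  # 2 bytes
    pos += 2

    languages = []
    for _ in range(lang_count):
        code = data[pos:pos + 3].decode("ascii")                     # 3 bytes
        (table_offset,) = struct.unpack_from(byte_order + "Q", data, pos + 3)  # 8 bytes
        languages.append(LanguageEntry(code, table_offset))
        pos += 11

    reference_code = data[pos:pos + 3].decode("ascii")               # 3 bytes
    pos += 3
    (model_count,) = struct.unpack_from(byte_order + "Q", data, pos)  # 8 bytes
    pos += 8
    return languages, reference_code, model_count, pos
```

The returned offset points at the first translation model, so a model parser can continue from there.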
Translation models
A translation model is a translatable string. Models can use variables, which are provided at runtime.
A model consists only of a variable set declaration.
Translation models contain no text, as that is the role of the translation strings. Decoding a translation set therefore does not directly reveal the source text.
Most translation tools start from a source language (such as English) and then translate to other languages, but when the file is bundled, the source language is written like any other, inside its own translation table.
Variable set declaration
A variable set declaration is made of the following:
- Number of variables in the set (2 bytes)
- For each variable:
  - Unique identifier of the variable (4 bytes)
  - Variable type (1 byte):
    - 0x00: delimited string (string)
    - 0x01: boolean (bool)
    - 0x02: 64-bit signed integer (int)
    - 0x03: 64-bit unsigned integer (uint)
    - 0x04: 64-bit floating-point number (float)
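A variable set declaration is small enough to decode in a few lines. The sketch below assumes little-endian integers (the byte order is not specified here) and treats a repeated identifier as an error, which is this example's own choice. `VarType` and `parse_variable_set` are hypothetical names.

```python
import struct
from enum import IntEnum


class VarType(IntEnum):
    STRING = 0x00  # delimited string
    BOOL = 0x01    # boolean
    INT = 0x02     # 64-bit signed integer
    UINT = 0x03    # 64-bit unsigned integer
    FLOAT = 0x04   # 64-bit floating-point number


def parse_variable_set(data: bytes, pos: int = 0, byte_order: str = "<"):
    """Return ({variable_id: VarType}, offset just past the declaration).

    ASSUMPTION: little-endian integers unless byte_order says otherwise.
    """
    (count,) = struct.unpack_from(byte_order + "H", data, pos)  # 2 bytes
    pos += 2

    variables = {}
    for _ in range(count):
        # 4-byte identifier followed by a 1-byte type code
        var_id, type_code = struct.unpack_from(byte_order + "IB", data, pos)
        if var_id in variables:
            raise ValueError(f"duplicate variable identifier {var_id}")
        variables[var_id] = VarType(type_code)  # ValueError on unknown codes
        pos += 5
    return variables, pos
```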
Translation tables
A translation table holds the translation of every defined model for a given language.
It consists of the following:
- For each model defined in the header:
  - Relative address of the translation string in this table
- For each model defined in the header:
  - The translation string itself
Translation strings must be in the same order as defined in the header.
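The address lookup can be sketched as follows. Two things are not specified in this document and are therefore assumptions of the example: the width of each relative address (left as the required `addr_size` parameter rather than guessed) and the origin of the relative addresses (taken to be the start of the table). `read_string_addresses` is a hypothetical helper.

```python
def read_string_addresses(data: bytes, table_offset: int, model_count: int,
                          addr_size: int, byte_order: str = "little"):
    """Return the absolute file offset of each translation string, in model order.

    ASSUMPTIONS: addresses are `addr_size` bytes wide, unsigned, and relative
    to the first byte of the translation table.
    """
    offsets = []
    for i in range(model_count):
        start = table_offset + i * addr_size
        relative = int.from_bytes(data[start:start + addr_size], byte_order)
        offsets.append(table_offset + relative)
    return offsets
```

Because the strings are stored in the same order as the models, the index into the returned list is also the index of the model in the header.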
Translation strings
A translation string is the dynamic translation of a translation model for a given language.
The format is complex because it aims to be compact on disk, fast to decode and apply, and expressive enough for complex translations. For instance, multiple conditionals can be chained so that a text is only shown if a set of provided numbers meets a requirement.
Translation strings use the following structure:
- Length of the translation string, in bytes (4 bytes)
- Assignable variables set
- Number of direct translation segments (4 bytes)
- Relative address of the dynamic library (4 bytes)
- For each direct translation segment:
  - Translation segment
- Number of optional translation segments in the dynamic library (4 bytes)
- For each optional translation segment:
  - Unique identifier for this segment (4 bytes)
  - Translation segment
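To make the relationship between direct segments and the dynamic library more concrete, here is a minimal in-memory sketch. It assumes that direct segments are rendered in order and concatenated, and that optional segments are only rendered when another segment refers to them by ID; this reading is an interpretation of the structure above, not a confirmed implementation. Only the fixed text, toggle and empty segment types are modeled, and all class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FixedText:
    text: str


@dataclass
class Toggle:
    variable_id: int  # identifier of a boolean variable
    if_true: int      # ID of an optional segment in the dynamic library
    if_false: int     # ID of an optional segment in the dynamic library


@dataclass
class Empty:
    pass


@dataclass
class TranslationString:
    direct_segments: list  # segments rendered in order
    dynamic_library: dict  # optional segment ID -> segment


def render_segment(segment, ts: TranslationString, variables: dict) -> str:
    if isinstance(segment, FixedText):
        return segment.text
    if isinstance(segment, Empty):
        return ""
    if isinstance(segment, Toggle):
        chosen = segment.if_true if variables[segment.variable_id] else segment.if_false
        # Optional segments may themselves be conditionals, hence the recursion.
        return render_segment(ts.dynamic_library[chosen], ts, variables)
    raise TypeError(f"unsupported segment: {segment!r}")


def render(ts: TranslationString, variables: dict) -> str:
    return "".join(render_segment(s, ts, variables) for s in ts.direct_segments)


ts = TranslationString(
    direct_segments=[FixedText("You have "), Toggle(1, if_true=10, if_false=11), FixedText(".")],
    dynamic_library={10: FixedText("new mail"), 11: FixedText("no mail")},
)
print(render(ts, {1: True}))   # prints "You have new mail."
print(render(ts, {1: False}))  # prints "You have no mail."
```

Keeping optional segments in a separate library means the same branch can be referenced from several conditionals without being stored twice.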
Translation segments
- Length of the segment in bytes (4 bytes)
- Segment type (1 byte):
  - 0x00: fixed text
    - Followed by a delimited string
  - 0x01: toggle
    - Followed by the identifier of a boolean variable (4 bytes)
    - Followed by the ID of the optional translation segment to use if the variable is truthy
    - Followed by the ID of the optional translation segment to use if the variable is falsy
  - 0x02: comparison
    - Followed by the ID of the variable to compare (must be of a number type)
    - Comparator (1 byte):
      - 0x01: equal
      - 0x02: not equal
      - 0x03: greater than
      - 0x04: less than
      - 0x05: greater than or equal to
      - 0x06: less than or equal to
    - Comparison object (1 byte):
      - 0x01: direct value
        - Followed by the raw number (must be of the same number type)
      - 0x02: variable
        - Followed by the ID of the variable to compare it to (must be of the same number type)
    - Followed by the ID of the optional translation segment to use if the comparison is truthy
    - Followed by the ID of the optional translation segment to use if the comparison is falsy
  - 0x03: property checking
    - Followed by the ID of the variable to check
    - Negation (1 byte):
      - 0x00: perform the check normally
      - 0x01: invert the check's result
    - Check (1 byte):
      - 0x01: (string-only): string is empty
      - 0x02: (string-only): string only contains ASCII characters
      - 0x20: (int-only): number is positive
      - 0x21: (int-only): number could be converted to a uint without loss
      - 0x30: (uint-only): number could be converted to an int without loss
      - 0x31: (uint-only): if the number was signed as an int, it would be negative
      - 0x40: (float-only): number has a non-zero decimal part
      - 0x41: (float-only): number could be converted to an int without loss
      - 0x42: (float-only): number could be converted to a uint without loss
    - Followed by the ID of the optional translation segment to use if the check's result is truthy
    - Followed by the ID of the optional translation segment to use if the check's result is falsy
  - 0x04: assignments
    - Identifier of the assignable variable (4 bytes)
    - Action type (1 byte):
      - 0x00: raw assign
        - Followed by the value to assign (must be of the exact same type)
        - Followed by the identifier of a variable (4 bytes)
      - 0x01: (bool-only) result of a comparison
        - Followed by a comparison's data (type, comparator, ...)
      - 0x02: (bool-only) result of a property checking
        - Followed by the check's data (negation, check, ...)
      - 0x10: (uint-only) length of a string, in bytes
        - Followed by the identifier of a string variable
  - 0xFF: empty (used for conditionals)
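The comparator and check codes listed above map naturally onto small lookup tables. The sketch below is one possible reading, with several assumptions: "positive" is taken as strictly greater than zero, int and uint are modeled as Python integers already within their 64-bit ranges, and non-finite floats fail every float check. `COMPARATORS`, `CHECKS`, `compare` and `check_property` are hypothetical names.

```python
import math
import operator

INT_MIN, INT_MAX = -(2**63), 2**63 - 1
UINT_MAX = 2**64 - 1

# Comparator byte -> binary predicate
COMPARATORS = {
    0x01: operator.eq,
    0x02: operator.ne,
    0x03: operator.gt,
    0x04: operator.lt,
    0x05: operator.ge,
    0x06: operator.le,
}


def _is_whole(x: float) -> bool:
    """True for finite floats with no fractional part."""
    return math.isfinite(x) and x == math.trunc(x)


# Check byte -> predicate on a value of the matching type
CHECKS = {
    0x01: lambda s: len(s) == 0,                      # string is empty
    0x02: lambda s: s.isascii(),                      # string is ASCII-only
    0x20: lambda n: n > 0,                            # int is positive (ASSUMED strict)
    0x21: lambda n: n >= 0,                           # int fits in a uint
    0x30: lambda n: n <= INT_MAX,                     # uint fits in an int
    0x31: lambda n: n > INT_MAX,                      # uint bits read as int are negative
    0x40: lambda x: math.isfinite(x) and x != math.trunc(x),  # non-zero decimal part
    0x41: lambda x: _is_whole(x) and INT_MIN <= x <= INT_MAX,  # float fits in an int
    0x42: lambda x: _is_whole(x) and 0 <= x <= UINT_MAX,       # float fits in a uint
}


def compare(comparator: int, left, right) -> bool:
    return COMPARATORS[comparator](left, right)


def check_property(check: int, value, negate: bool = False) -> bool:
    result = CHECKS[check](value)
    return not result if negate else result
```

A decoder would still have to verify that the check code matches the variable's declared type before calling the predicate, since the tables above do not enforce it.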