# [NTG-context] Basic question on Unicode and ConTeXt

Hans Hagen pragma at wxs.nl
Tue Jul 19 10:06:53 CEST 2005

```Mojca Miklavec wrote:

>Hans Hagen wrote:
>
>
>>Mojca Miklavec wrote:
>>
>>
>>
>>>>(concerning eregi-* files: you can define filesynonyms so we need a list of filesynonyms and regimesynonyms)
>>>>
>>>>
>>>>
>>>What do you mean by writing file synonyms? Where would it be used?
>>>
>>>
>>\definefilesynonym  [mojka]  [mojca]
>>\definefilesynonym  [moika]  [mojca]
>>\definefilesynonym  [moica]  [mojca]
>>
>>
>
>OK, if you are being provocative, I'll strike back:
>None of the definitions above are allowed because they don't warn the
>user if he's using the wrong name. They should throw an error instead.
>The only proper way would be to define something like
>
>\setuplabeltext[\s!en][\v!pronouncemyname=moitsa]
>\setuplabeltext[\s!de][\v!pronouncemyname=mojza]
>\setuplabeltext[\s!ru][\v!pronouncemyname=мойца]
>...
>
>

\translate[en=moitsa,de=mojza,ru=мойца]

then -)

>OK. I'll prepare \defineregimesynonym-s proposals, but I still don't
>know what the file synonyms should be used for in this context. The
>user probably doesn't need to care about file names?
>
>
depends on whether you want to preload all those vectors (which takes quite
some memory, although i may find a way around that [maybe delayed loading])

>So why not map the characters to unicode first and define the
>mapping from unicode to \TeXcommand only once? regi-* files (at least
>in the meaning they have now) could be prepared automatically by a
>script, less error-prone and without the need to say "Some more
>
>
you mean ...

\defineactivetoken 123 {\uchar{...}{...}}

it is an option, but it's much slower and takes much more memory

\uchar{2}{33} takes 1 hash pointer and 7 char slots (so probably 8 mem
locations) while \eacute takes one mem location

>Is it possible to switch the regimes in the middle of the document
>(like it is possible to switch the languages)? An example usage would
>be if some input documents (plain text, some older TeX files or
>database entries) are written in some other encoding than the main
>stream.
>(Possibly switching in such a way that no leftovers remain after the
>old encoding is replaced by a new one.)
>
>
switching is possible, but in that case you probably want to set toc/index/etc expansion to yes

Hans

-----------------------------------------------------------------