[dev-context] Turkish hyphenation patterns

Mojca Miklavec mojca.miklavec.lists at gmail.com
Mon May 22 16:55:21 CEST 2006


Hello Hans,

Regarding the Turkish hyphenation patterns (which don't seem to be
EC-encoded): I tried to decipher them, although a native speaker
should probably take a look.

I'm sending the original documentation and a list of replacements.
I would suggest replacing the following letters (see the attached file):
    i,i:,o:,u:,a=,i=,o=,c:,g=,s:
and to ignore the first three patterns and all those containing any of
the following sequences (they seem to be there just for the sake of
Arabic transliteration):
    @,#,g:,z:,d!,h!,k!,s!,t!,z!,d=,n=,t=,z=,h=,s=

(Or to regenerate the patterns from scratch with a Ruby script,
which might be faster :)
    ftp://tug.ctan.org/pub/tex-archive/language/turkish/hyphen/turk_hyf.c
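
A minimal sketch of the replacement and filtering step described above,
in Python rather than Ruby. The function names and the sample patterns in
the usage note are hypothetical; the real pattern file may need extra
handling (comments, the first three patterns to be skipped), which is left
to the caller here. The longest ASCII sequence is matched first, so that
"i:" becomes "i" before a bare "i" becomes dotless "ı".

```python
import re

# Replacement table, taken from the list in this message.
REPLACEMENTS = {
    "i": "ı", "i:": "i", "o:": "ö", "u:": "ü",
    "a=": "â", "i=": "î", "o=": "ô",
    "c:": "ç", "g=": "ğ", "s:": "ş",
}

# Sequences that seem to exist only for Arabic transliteration;
# any pattern containing one of them is dropped.
SKIP = ["@", "#", "g:", "z:", "d!", "h!", "k!", "s!", "t!", "z!",
        "d=", "n=", "t=", "z=", "h=", "s="]

# Longest alternatives first, so "i:" and "i=" win over plain "i".
_TOKEN = re.compile(
    "|".join(sorted(map(re.escape, REPLACEMENTS), key=len, reverse=True))
)


def convert_pattern(pattern):
    """Return the converted pattern, or None if it should be ignored."""
    if any(seq in pattern for seq in SKIP):
        return None
    return _TOKEN.sub(lambda m: REPLACEMENTS[m.group(0)], pattern)


def convert_patterns(lines):
    """Convert an iterable of pattern lines, dropping empty and ignored ones."""
    converted = []
    for line in lines:
        result = convert_pattern(line.strip())
        if result:
            converted.append(result)
    return converted
```

For example, convert_patterns(["u:1c:", "1@a", "2i:1s:"]) keeps the first
and last (made-up) patterns, converted to Unicode, and drops the middle one
because it contains "@".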

Mojca
-------------- next part --------------

% part of alphabet:
%    a, e, o, u

i  ı % dotlessi
i: i % i
o: ö % odiaeresis
u: ü % udiaeresis

% not part of alphabet, but sometimes used:

a= â % acircumflex
i= î % icircumflex
o= ô % ocircumflex


% part of alphabet:
%    b, c, d, f, g, h, j, k, l, m, n, p, r, s, t, v, y, z

c: ç % ccedilla
g= ğ % gbreve
s: ş % scedilla

% --------------
% probably to be ignored

% others; http://en.wikipedia.org/wiki/Arabic_alphabet

@  ء % 0621 / afii57409 / ARABIC LETTER HAMZA
#    % ayn??? - no idea what this should be

g: ġ % gdotaccent
z: ż % zdotaccent

d! ḍ % 1E0D / ddot[below/accent] / might not have the name yet
h! ḥ % 1E25 / hdot[below/accent]
k! ḳ % 1E33 / kdot[below/accent]
s! ṣ % 1E63 / sdot[below/accent]
t! ṭ % 1E6D / tdot[below/accent]
z! ẓ % 1E93 / zdot[below/accent]

d= ḏ % 1E0F / LATIN SMALL LETTER D WITH LINE BELOW
n= ñ % ntilde
t= ṯ % 1E6F / LATIN SMALL LETTER T WITH LINE BELOW
z= ẕ % 1E95 / LATIN SMALL LETTER Z WITH LINE BELOW

h= ẖ % 1E96 / LATIN SMALL LETTER H WITH LINE BELOW
s=   % s with line below: no precomposed character in Unicode (s + 0331 COMBINING MACRON BELOW); perhaps scaron?
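
The code points quoted in the table above can be checked against the
Unicode character database. This is a small sketch using Python's standard
unicodedata module; the CODE_POINTS table and the describe helper are
illustrative names, and only the code points already listed above are
included.

```python
import unicodedata

# ASCII transliteration -> code point, as listed in the table above.
CODE_POINTS = {
    "d!": 0x1E0D, "h!": 0x1E25, "k!": 0x1E33,
    "s!": 0x1E63, "t!": 0x1E6D, "z!": 0x1E93,
    "d=": 0x1E0F, "t=": 0x1E6F, "z=": 0x1E95,
}


def describe(code_point):
    """Return 'char U+XXXX NAME' for one code point."""
    char = chr(code_point)
    name = unicodedata.name(char, "<unnamed>")
    return f"{char} U+{code_point:04X} {name}"


if __name__ == "__main__":
    for ascii_form, code_point in CODE_POINTS.items():
        print(ascii_form, describe(code_point))
```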


-------------- next part --------------
A non-text attachment was scrubbed...
Name: turkish-hyphens.pdf
Type: application/pdf
Size: 71255 bytes
Desc: not available
Url : http://www.ntg.nl/mailman/private/dev-context/attachments/20060522/4f5e82c1/attachment-0001.pdf 

