Tools for generating hyphenation patterns

Piet Tutelaers, a member of the NTG Workgroup Spelling, wrote a set of tools that he used to develop the new hyphenation patterns for the Dutch language, based on the 1996 edition of the Groene Boekje. The sorting and merging tools expect sorted word lists encoded in ISO Latin1. For the benefit of MS-DOS users who work in ANSI, the Perl script lat2ansi.pl is available. You can download the tools and use them at your own risk.

Overview of the available tools:

c2ascii:
a very simple tool to convert 8-bit text to 7-bit ASCII. Any character with a code above 127 is converted to ^^xx (where xx is its hexadecimal representation).
diacrit:
a program that shows the ISO Latin1 character set.
hyphenate:
this program can generate a hyphenation dictionary by applying TeX hyphenation patterns to a dictionary.
hyphens.pl:
a Perl script to generate hyphenated words from a set of hyphenation patterns, optionally with exceptions, and a list of unhyphenated words. The program correctly handles Dutch words containing an apostrophe or a hyphen ('koppelteken'). Hyphenation points are shown as a raised dot because Dutch uses the - character inside words (na-apen).
latin2pc.pl:
a Perl script for converting ISO Latin1 files to/from IBM PC code page 437.
worddiff:
this program outputs all words from the first word list that are not present in the second word list (set difference).
wordints:
this program outputs the intersection of two wordsorted word lists.
wordsort:
this program sorts a dictionary containing accents and hyphens in dictionary order (only letters and digits are taken into account).
wordmerge:
this program merges two or more lists of words (sorted!) into a larger dictionary (also sorted).
wordmatch:
this program compares a dictionary of hyphenated words against a file containing the same words. With this program you can measure how well a given set of patterns behaves.
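
To make a few of the descriptions above concrete, here is a minimal Python sketch of what c2ascii, worddiff and wordints are described as doing. It is not the code of the tools themselves: the function names are hypothetical, the lowercase two-digit hexadecimal form is an assumption, and the set operations here simply use a Python set instead of relying on the input being sorted.

```python
def to_seven_bit(text: str) -> str:
    """Sketch of the c2ascii idea: rewrite non-ASCII Latin1 characters as ^^xx."""
    out = []
    for byte in text.encode("latin-1"):
        if byte > 127:
            # Assumption: every code outside 7-bit ASCII is converted,
            # written as two lowercase hexadecimal digits.
            out.append(f"^^{byte:02x}")
        else:
            out.append(chr(byte))
    return "".join(out)


def word_difference(first, second):
    """Sketch of the worddiff idea: words in `first` that are not in `second`."""
    second_set = set(second)
    return [word for word in first if word not in second_set]


def word_intersection(first, second):
    """Sketch of the wordints idea: words that occur in both lists."""
    second_set = set(second)
    return [word for word in first if word in second_set]


if __name__ == "__main__":
    print(to_seven_bit("caf\u00e9"))  # prints caf^^e9
    print(word_difference(["aap", "mies", "noot"], ["noot"]))
    print(word_intersection(["aap", "mies", "noot"], ["noot"]))
```

Both list functions keep the order of the first list, so sorted input gives sorted output.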