This directory contains encoding maps for some selected encodings.

These maps were generated by a Perl script, make_encmap, from mapping
information contained on the Unicode version 2.0 CD-ROM. This CD-ROM comes
with the Unicode Standard reference manual and can be ordered from the
Unicode Consortium at http://www.unicode.org. These mappings are also
available on the Internet at ftp://ftp.unicode.org/Public/MAPPINGS.

If you edit the generated XML file and add the attribute "expat='yes'" to
the encmap start tag, you can then use the compile_encoding script to check
whether the map meets expat requirements (and also to create the
corresponding binary encmap file).

The file encmap.dtd is the document type definition for these files
and contains information about their semantics. This should give you
sufficient information to build your own encoding map. I can't vouch
for the validity of the DTD, since I haven't processed it. It is provided
for informational purposes only.

As mentioned in the DTD, the expat library places some restrictions, for
efficiency reasons, on the kinds of encodings that can be loaded.

One of those restrictions is that the encoding must represent the ASCII
set of characters with a single byte and that byte must be equal to the
equivalent Unicode scalar value with the exception of a few punctuation
characters.

So although I have a map for ISO-8859-6 here, it is not an expat-mode
map, since it maps the ASCII digits to their Arabic-Indic equivalents
(U+0660 - U+0669).
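
As a rough illustration of the ASCII restriction described above, the
following Python sketch checks a single-byte map against it. It is not part
of make_encmap or compile_encoding; the function name, the dictionary
representation of a map, the set of exempt punctuation characters
($ @ \ ^ ` { } ~), and the choice of which bytes to check are all
assumptions made for the example, so consult the expat documentation for
the authoritative rules.

```python
# Hypothetical checker; names and data layout are illustrative only.

# Assumed set of punctuation bytes exempt from the rule.
EXEMPT_BYTES = set(b"$@\\^`{}~")

# Assumed set of ASCII bytes to check: tab, LF, CR and the printable range.
CHECKED_BYTES = [0x09, 0x0A, 0x0D] + list(range(0x20, 0x7F))


def ascii_violations(byte_to_ucs):
    """Return (byte, mapped_value) pairs that break the ASCII rule.

    byte_to_ucs maps a single byte value to a Unicode scalar value.
    A missing entry is reported with a mapped value of None.
    """
    bad = []
    for byte in CHECKED_BYTES:
        if byte in EXEMPT_BYTES:
            continue
        mapped = byte_to_ucs.get(byte)
        if mapped != byte:
            bad.append((byte, mapped))
    return bad


# A map that leaves ASCII untouched passes the check.
identity = {b: b for b in range(0x80)}

# A map that sends the ASCII digits to U+0660 - U+0669, as described
# above for the ISO-8859-6 map in this directory, does not.
arabic = dict(identity)
for i in range(10):
    arabic[0x30 + i] = 0x0660 + i

print(ascii_violations(identity))     # prints []
print(len(ascii_violations(arabic)))  # prints 10
```

A map that reports no violations here may still fail other expat
requirements; this sketch covers only the ASCII rule.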

Clark Cooper
November 28, 1998



