[Trennmuster] dehyph*.tex and dehyph*.pat
Mojca Miklavec
mojca.miklavec.lists at gmail.com
Wed Jul 23 07:51:19 CEST 2014
2014-07-21 20:48 GMT+02:00 Guenter Milde <milde at users.sf.net>:
> Dear Trennmuster folks,
>
> once more on pattern-file generation and packaging:
>
> Is there a reason for the current division of content between
> dehyph*.pat and dehyph*.tex?
>
> Put differently: could we change this division so that the
> pattern file dehyph*.pat really contains only the patterns?
We discussed exactly the same question when we started the project in
2008: should we use .tex files or "pure" patterns, one pattern per
line, like in OpenOffice.
We ended up with .tex files because many developers/authors add a lot
of useful comments, and manually generated patterns are often
ordered nicely. We felt it would be bad to force users to drop that
meta information, so we made a compromise and decided to allow
just \patterns{...} in that file (with two exceptions), so that the
file remains easy to parse.
Jonathan Kew even wrote the TeX code that ignores the first line with the encoding.
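To illustrate why restricting the file to a single \patterns{...} block keeps parsing easy, here is a minimal Python sketch. It is not the code used in hyph-utf8; the function name `extract_patterns` and the sample input are made up for illustration, and it assumes the block contains no nested braces.

```python
import re

def extract_patterns(tex_source):
    """Return the whitespace-separated patterns inside \\patterns{...}."""
    # Drop TeX comments: everything from an unescaped % to the end of the line.
    no_comments = re.sub(r"(?<!\\)%.*", "", tex_source)
    # Assumption: exactly one \patterns{...} block without nested braces.
    match = re.search(r"\\patterns\s*\{([^{}]*)\}", no_comments)
    if match is None:
        return []
    return match.group(1).split()

# Hypothetical sample input, not taken from the real dehyph files.
sample = r"""% some header comment
\patterns{%
.ab1c % a comment on a pattern
2d3e.
}
"""

# Prints one pattern per line, the "pure" form discussed above.
print("\n".join(extract_patterns(sample)))
```

The comments and the ordering stay in the .tex source, and a "pure" list can still be derived from it mechanically.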
From the point of view of hyph-utf8 I don't care what you decide to
use. OK, it's a tiny bit easier to just copy the relevant file, but
changing it wouldn't be rocket science, so if you would prefer "the
clean" approach, that's still fine for us.
> * TeX reads dehyph*.tex, which then includes dehyph*.pat via \input.
>
> If one moves the header of dehyph.pat up to
>
> \patterns{%
>
> and the footer from daten/dehy*.3 into dehyph*.tex, that should make
> no difference to TeX.
>
> * Many application programs expect only the patterns and choke on
> the "TeX wrapper". For example OpenOffice/LibreOffice, which needs the
> encoding in the first line and then the patterns.
>
> These programs (or the people who process our patterns for these
> programs) could benefit from a redistribution.
In the process I have learned that Open/LibreOffice requires some
further changes to TeX patterns (called "compression" or so) to work
properly. So OpenOffice patterns always work in TeX, but the reverse
is not necessarily true.
Anyway:
- the patterns for <whatever>Office would need a modification anyway
(the first line at least)
- your patterns end up converted into a more "user-friendly" format in
hyph-utf8 as well
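As a rough sketch of the "first line at least" modification: the helper below only prepends an encoding line to a list of patterns, one pattern per line. The name `with_encoding_header` and the default value "UTF-8" are assumptions for illustration, and the sketch deliberately does not attempt the "compression" step mentioned above, so its output is not claimed to be a working Office dictionary.

```python
def with_encoding_header(patterns, encoding="UTF-8"):
    """Return the patterns as text: encoding on line 1, then one pattern per line."""
    # Assumption: the encoding name is written verbatim on the first line.
    return "\n".join([encoding, *patterns]) + "\n"

# Hypothetical patterns, not taken from the real dehyph files.
print(with_encoding_header([".ab1c", "2d3e."]), end="")
```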
> * The hyph-utf8 package likewise reassembles the
> dehyph-exptl files and then contains:
>
> patterns/txt/hyph-de-19*.chr.txt % character set
> patterns/txt/hyph-de-19*.hyp.txt % empty
Hyphenation exceptions. Missing for German.
> patterns/txt/hyph-de-19*.lic.txt % license
> patterns/txt/hyph-de-19*.pat.txt % pure patterns
There's another difference there. We add the Unicode apostrophe to the
patterns in addition to '. But it's not relevant for German at the
moment.
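A minimal sketch of what such an addition could look like: every pattern containing ' gets a second variant with the Unicode apostrophe. This is not the actual hyph-utf8 generator; the function name is made up, and using U+2019 as the code point is an assumption.

```python
ASCII_APOSTROPHE = "'"
UNICODE_APOSTROPHE = "\u2019"  # assumption: RIGHT SINGLE QUOTATION MARK

def add_unicode_apostrophe(patterns):
    """Keep every pattern and add a U+2019 variant for those containing '."""
    result = []
    for pattern in patterns:
        result.append(pattern)
        if ASCII_APOSTROPHE in pattern:
            result.append(pattern.replace(ASCII_APOSTROPHE, UNICODE_APOSTROPHE))
    return result

# Hypothetical patterns for illustration only.
for pattern in add_unicode_apostrophe(["l'1a", "ab1c"]):
    print(pattern)
```

Patterns without an apostrophe pass through unchanged, which is why the step makes no difference for the German files at the moment.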
> patterns/tex/hyph-de-19*.tex % what we call dehyph*.pat!
>
> loadhyph/loadhyph-de-19*.tex % functionally equivalent to dehyph*.tex!
There is also patterns/ptex/hyph-de-19*.ec.tex. (I might change "ptex"
into something else one day.)
These patterns work in all 8-bit engines (including pdfTeX), but are
currently only used in pTeX which is unable to do the conversion on
the fly.
> A redistribution on our side should be coordinated.
>
> Can we use the code from hyph-utf8 for the "8-bit/Unicode switch"?
Sure.
You can of course use
\input conv-utf8-ec.tex
which would be the easiest (hyph-utf8 should be installed by default anyway).
Or you can copy (and rename) the file if that suits you better.
Just don't ask me to add "long s" to the EC encoding ;)
Btw: I was thinking of dropping the auto-conversion from UTF-8 (in
loadhyph-*). Now that we have 8-bit patterns anyway, we could just as
well load the "8-bit" patterns directly (those originally created for
pTeX). In any case we would leave the converters in the
distribution.
Mojca