[Trennmuster] Hyphenation pattern for compounds with joining hyphen
Pander
pander at users.sourceforge.net
Sat Jul 16 16:33:47 CEST 2022
On 7/15/22 18:49, Keno Wehr wrote:
> On 15.07.22 at 16:21, Pander wrote:
>> Hi all,
>>
>> What would be the hyphenation pattern for trennmuster for compounds
>> with a joining hyphen https://en.wikipedia.org/wiki/Hyphen#Joining
>>
>> Examples are Calmette-Guérin
>> https://github.com/toddy15/medicalterms/blob/main/dicts/namen-x-de_all-medicalterms.txt#L26
>>
>> and Alanin-Aminotransferase
>> https://github.com/toddy15/medicalterms/blob/main/dicts/medizin-xx-de_all-medicalterms.txt#L322
>>
>>
>>
>> Is there a way to escape such hyphen or is there another way to
>> distinguish such a hyphen from a hyphenation point?
>
> Please describe more precisely what you want to do, I'm not quite sure
> what you mean.
I am working on updating the hyphenation patterns for Dutch for TeX,
libhyphen, etc. They were last updated at the end of 1996, after a
spelling change. In 2006 there was another spelling change. The old
patterns do not support the new spelling and are missing a lot of new
words. Our collection for the word list and spelling checker has grown
by at least 100,000 words since then. We also have more hyphenation
patterns, but the last time somebody worked on them was in 2007,
experimentally, and that work is unfortunately undocumented and abandoned.
I have known the Trennmuster project for years and have made some minor
contributions in the past. Also, on the advice of the creator of the
Frisian hyphenation patterns, the Trennmuster approach would be a good
fit for the new Dutch hyphenation patterns, as we have identical
transcription changes when hyphenating.
Unlike German, Dutch has many compounds with a joining hyphen, e.g.
zwart-witfotografie, toe-eigenen, normalisatie-instellingen and
opera-uitvoeringen. These hyphens take priority over all other
hyphenation points. The preferred hyphenation is
..... ..... normalisatie-
instellingen
but when absolutely needed, these are also possible:
..... ..... ..... norma-
lisatie-instellingen
.... normalisatie-instel-
lingen
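
To make the intended priority concrete, here is a minimal Python sketch. It is only an illustration of the ranking described above, not anything Trennmuster or TeX provides: the function name break_candidates and the nested-list input format are hypothetical, and the fragments are limited to the break points shown in the examples above.

```python
def break_candidates(parts):
    """Rank line-break candidates for a compound with joining hyphens.

    `parts` is a hypothetical input format: one list per compound member,
    each holding the fragments between its ordinary hyphenation points.
    Returns (priority, head, tail) tuples; priority 0 = break at the
    joining hyphen, priority 1 = ordinary hyphenation point.
    """
    # Flatten into (fragment, separator that follows the fragment).
    pieces = []
    for i, fragments in enumerate(parts):
        for j, frag in enumerate(fragments):
            if j < len(fragments) - 1:
                sep = "soft"      # ordinary hyphenation point
            elif i < len(parts) - 1:
                sep = "hyphen"    # joining hyphen written in the word
            else:
                sep = None        # end of the word
            pieces.append((frag, sep))

    def render(chunk):
        # Joining hyphens are always printed, soft points are invisible.
        return "".join(f + ("-" if s == "hyphen" else "") for f, s in chunk)

    candidates = []
    for k, (_, sep) in enumerate(pieces[:-1]):
        head = render(pieces[:k + 1])
        if sep == "soft":
            head += "-"           # a hyphen appears only when breaking here
        tail = render(pieces[k + 1:])
        candidates.append((0 if sep == "hyphen" else 1, head, tail))

    # Stable sort: joining-hyphen breaks first, the rest in word order.
    return sorted(candidates, key=lambda c: c[0])


for priority, head, tail in break_candidates(
        [["norma", "lisatie"], ["instel", "lingen"]]):
    print(priority, head, tail)
```

The break at the joining hyphen comes out first, and the two ordinary break points follow as fallbacks, which is the behaviour I would like the patterns to express.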
I did not find any examples in the wordlist file, so I searched for a
German example word that fits my problem and is not yet in the wordlist,
hence Alanin-Aminotransferase, with the possible hyphenations being
..... ..... Alanin-
Aminotransferase
..... ..... ..... Ala-
nin-Aminotransferase
..... Alanin-Amino-
transferase
What would the corresponding entry in the wordlist file format be?
Alanin-Aminotransferase;A·la-nin{-/==}A·mi-no<trans<fer=a-se
My reasoning was that the first part of the {/} is the unhyphenated
spelling. With your info below, would it instead be
Alanin;A·la-nin
Aminotransferase;A·mi-no<trans<fer=a-se
and then use in TeX
Alanin"=Aminotransferase
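
As a sanity check on the entries above, here is a small Python sketch that strips the markup from the right-hand side and compares the result with the key on the left. It rests on assumptions: it only knows the marker characters that appear in this message (· - < > =), it reads {a/b} the way I described (first alternative is the unhyphenated spelling), and the function names are made up. It is not the parser used by the wordlist tools.

```python
import re

# Either a {first/second} alternative, or a single marker character.
# Assumption: only the markers seen in this message are handled.
MARKUP = re.compile(r"\{([^/}]*)/[^}]*\}|[·\-<>=]")


def plain_spelling(marked):
    """Remove hyphenation markup, keeping the first {a/b} alternative."""
    return MARKUP.sub(
        lambda m: m.group(1) if m.group(1) is not None else "", marked)


def entry_is_consistent(line):
    """Check that 'key;marked-up form' spells the same word on both sides."""
    key, marked = line.split(";", 1)
    return plain_spelling(marked) == key


for line in (
    "Alanin;A·la-nin",
    "Aminotransferase;A·mi-no<trans<fer=a-se",
    "Alanin-Aminotransferase;A·la-nin{-/==}A·mi-no<trans<fer=a-se",
):
    print(line, entry_is_consistent(line))
```

Under these assumptions all three entries are consistent with their keys, so the open question is only whether {-/==} is an acceptable way to mark the joining hyphen.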
Unfortunately, regular Dutch has a lot of these words so I need to have
them in the pattern file (for LibreOffice, Firefox, Chrome) and cannot
use Babel's "= for that. Looking forward to your thoughts on this.
(Hoping Trennmuster will support this edge case.)
Thanks,
Pander
PS Is this centered dot at the beginning of the line a typo?
https://repo.or.cz/wortliste.git/blob/HEAD:/zusatzlisten/arzneiwirkstoffnamen-supplement#l221
·Angustifolium;An·gu·sti·fo·li·um
>
> Note that it's TeX's default for words with explicit hyphens to allow
> line breaks only after the explicit hyphens. Normally, no hyphenation
> patterns will apply to these words.
> If you use the Babel shorthand "= instead of the explicit hyphen, the
> regular patterns will apply to the word parts before and after. This
> is usually fine, but our patterns are intended for "regular" German,
> technical terminology is covered only partly. The terms you state
> above are very special (the first one is French rather than German)
> and may lead to wrong hyphenation points with our patterns.
>
> All the best,
> Keno