[Trennmuster] Railroad diagram first complete version

Pander pander at users.sourceforge.net
Fri Nov 15 17:13:04 CET 2013


On 15-11-13 14:56, Werner LEMBERG wrote:
>> Please review the first complete railroad diagram of the hyphenation
>> format found in the attachment. It should cover the complete file
>> wortliste.
> This looks very nice graphically!  However, I'm not happy with the
> contents.  Here are my comments.
>
> . Hyphen:
>
>   - Neither the order nor the number of the elements is fixed.
>     In other words, something like `=|' is also valid, meaning the
>     same as `|='.
Will change it.
>   - The number of elements is not limited (except `·' which can occur
>     only once).  Theoretically, something like `-----' could also
>     happen.
So no upper limit at all? In Dutch we have a word of 50 characters
  aansprakelijkheidswaardevaststellingsveranderingen
and that would be, for example
  aan|spra-ke-lijk--heids==waar-de=vast|stel-lings===ver-an-de-ring-en
So five hyphens would be very rare. But if the format should have no
upper limit, I will adjust it.
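
A minimal sketch of the Hyphen rule as described in the two comments above: any order, any count, but `·' at most once. The marker set `-=|·' is an assumption collected from the characters mentioned in this thread (the undocumented `_' is left out), and the name is_hyphen is hypothetical.

```python
# Assumed marker set, taken from the characters mentioned in this thread.
# The undocumented `_` element is deliberately not included.
HYPHEN_CHARS = frozenset("-=|·")


def is_hyphen(run: str) -> bool:
    """Return True if `run` is a valid Hyphen element under the rules above.

    - at least one marker
    - markers may appear in any order and any number
    - the `·` marker may occur at most once
    """
    if not run:
        return False
    if any(ch not in HYPHEN_CHARS for ch in run):
        return False
    return run.count("·") <= 1


if __name__ == "__main__":
    for candidate in ["=|", "|=", "-----", "===", "·", "·-·", "_", ""]:
        print(repr(candidate), is_hyphen(candidate))
```

With no upper limit, both `=|' and `|=' pass, as does `-----'; only a second `·' or an unknown character is rejected.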
>
>   - The `_' element is currently undocumented and subject to change.
I will remove it.
> . MorphemeAltSpelling:
>
>   You must use the class `Hyphen' instead of listing the possible
>   hyphenation characters explicitly.
OK
>
> . MorphemeMultiHyphenation:
>
>   It's a *very* bad idea to list the possible values explicitly, since
>   it neither covers future additions nor recursion correctly.
First I created a generic version, but the patterns were very complex.
Therefore I chose to do it like this for now, to double-check what it
should support. Making it generic is already planned.
>
> . Char:
>
>   Similarly, it's a bad idea to list the possible values explicitly.
>   Instead, I would use the Unicode Latin ranges.
For the international standard I will do that, and I can already
support it here too.
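
A rough sketch of what a range-based Char class could look like. Which Unicode blocks count as "the Latin ranges" is not settled in this thread; the selection below (letters of Basic Latin, Latin-1 Supplement, Latin Extended-A and Latin Extended-B) is an assumption, and is_char is a hypothetical name.

```python
# Assumed selection of Unicode Latin letter ranges (inclusive code points).
LATIN_RANGES = (
    (0x0041, 0x005A),  # Basic Latin, A-Z
    (0x0061, 0x007A),  # Basic Latin, a-z
    (0x00C0, 0x00D6),  # Latin-1 Supplement letters, up to U+00D6
    (0x00D8, 0x00F6),  # skips U+00D7 MULTIPLICATION SIGN
    (0x00F8, 0x00FF),  # skips U+00F7 DIVISION SIGN
    (0x0100, 0x017F),  # Latin Extended-A
    (0x0180, 0x024F),  # Latin Extended-B
)


def is_char(ch: str) -> bool:
    """Return True if `ch` is a single code point inside one of the ranges."""
    if len(ch) != 1:
        return False
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in LATIN_RANGES)


if __name__ == "__main__":
    for ch in ["a", "Z", "ä", "ß", "é", "-", "=", "1", "×"]:
        print(repr(ch), is_char(ch))
```

Compared with an explicit list of letters, new entries with other Latin letters would be accepted without touching the grammar.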
> My conclusion is that this grammar is just an ad-hoc solution,
> unfortunately, not correctly representing the `wortliste'.
It is work in progress. It does represent wortliste as it is now but
should indeed, as you pointed out earlier, be more generic.

Thanks for reviewing this version.
>
>> Does anyone know a script that can validate wortliste against this
>> definition?
> I don't know such a script.
>
>
>     Werner
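
In the absence of an existing script, here is a small sketch of what a validator for the simplest case might do: check that a hyphenated form alternates letter runs and marker runs, that no marker run has more than one `·', and that removing the markers gives back the plain word. It is only an illustration under assumptions: the marker set `-=|·' is collected from this thread, letters are approximated by Python's Unicode letter matching, the function name validate is hypothetical, and MorphemeAltSpelling and MorphemeMultiHyphenation are not handled at all.

```python
import re

# Marker runs: one or more of the assumed markers - = | ·
RUN_RE = re.compile(r"[-=|·]+")
# Letter run, then any number of (marker run, letter run) pairs.
# [^\W\d_] approximates "a letter"; this is an assumption, not the grammar.
ENTRY_RE = re.compile(r"[^\W\d_]+(?:[-=|·]+[^\W\d_]+)*")


def validate(plain: str, hyphenated: str) -> list[str]:
    """Return a list of problems found; an empty list means no problem."""
    problems = []
    if ENTRY_RE.fullmatch(hyphenated) is None:
        problems.append("not an alternation of letter runs and marker runs")
    for run in RUN_RE.findall(hyphenated):
        if run.count("·") > 1:
            problems.append(f"marker run {run!r} has more than one '·'")
    if RUN_RE.sub("", hyphenated) != plain:
        problems.append("removing the markers does not give the plain word")
    return problems


if __name__ == "__main__":
    word = "aansprakelijkheidswaardevaststellingsveranderingen"
    form = ("aan|spra-ke-lijk--heids==waar-de=vast|stel-lings"
            "===ver-an-de-ring-en")
    print(validate(word, form) or "ok")
```

The Dutch example from earlier in this message passes these three checks; a real validator would have to be generated from, or at least follow, the final grammar.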



