[Trennmuster] Railroad diagram first complete version
Pander
pander at users.sourceforge.net
Fri Nov 15 17:13:04 CET 2013
On 15-11-13 14:56, Werner LEMBERG wrote:
>> Please review the first complete railroad diagram of the hyphenation
>> format found in the attachment. It should cover the complete file
>> wortliste.
> This looks very nice graphically! However, I'm not happy with the
> contents. Here are my comments.
>
> . Hyphen:
>
> - Neither the order nor the number of the elements is fixed.
> In other words, something like `=|' is also valid, meaning the
> same as `|='.
Will change it.
> - The number of elements is not limited (except `·' which can occur
> only once). Theoretically, something like `-----' could also
> happen.
So no upper limit at all? In Dutch we have a word of 50 characters,
aansprakelijkheidswaardevaststellingsveranderingen
which would be, for example,
aan|spra-ke-lijk--heids==waar-de=vast|stel-lings===ver-an-de-ring-en
So five hyphens in a row would be very rare. But if the format should have
no upper limit, I will adjust it.
>
> - The `_' element is currently undocumented and subject to change.
I will remove it.
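
To make the Hyphen rules from the comments above concrete, here is a
minimal Python sketch. It assumes that the hyphen markers are `-', `=',
`|' and `·' (the ones mentioned in this thread), that `_' is left out
because it is undocumented, and that the "only once" restriction on `·'
applies within a single run of markers. The names HYPHEN_RUN and
is_valid_hyphen are made up for illustration; this is not an existing
script.

```python
import re

# Assumed marker set: `-`, `=`, `|` and `·`, as mentioned in this thread.
# `_` is deliberately excluded (undocumented, subject to change).
HYPHEN_RUN = re.compile(r"[-=|·]+")


def is_valid_hyphen(run: str) -> bool:
    """Return True if `run` is a non-empty sequence of hyphen markers.

    Order and length are unrestricted; `·` may occur at most once
    (assumed here to mean once per run).
    """
    return HYPHEN_RUN.fullmatch(run) is not None and run.count("·") <= 1


if __name__ == "__main__":
    word = ("aan|spra-ke-lijk--heids==waar-de=vast|stel-lings"
            "===ver-an-de-ring-en")
    # Every run of markers found in the example word should be accepted.
    runs = HYPHEN_RUN.findall(word)
    print(all(is_valid_hyphen(r) for r in runs), max(len(r) for r in runs))
```

With this reading, `=|' and `|=' are both accepted, and so is `-----'.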
> . MorphemeAltSpelling:
>
> You must use the class `Hyphen' instead of listing the possible
> hyphenation characters explicitly.
OK
>
> . MorphemeMultiHyphenation:
>
> It's a *very* bad idea to list the possible values explicitly, since
> it neither covers future additions nor recursion correctly.
First I created a generic version, but the patterns became very
complex. Therefore I chose to do it like this for now, to double-check what
it should support. Making it generic is already planned.
>
> . Char:
>
> Similarly, it's a bad idea to list the possible values explicitly.
> Instead, I would use the Unicode Latin ranges.
For the international standard I will do that and can already support it
here too.
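
As an illustration of the range-based approach to Char, here is a
minimal Python sketch. Which Unicode blocks to include is an assumption
on my side (Basic Latin letters, the letters of Latin-1 Supplement,
Latin Extended-A and Latin Extended-B); the names LATIN_RANGES,
NON_LETTERS and is_char are hypothetical, and the real grammar may need
a different selection.

```python
# Inclusive code point ranges treated as Latin letters (assumed selection).
LATIN_RANGES = [
    (0x0041, 0x005A),  # Basic Latin, A-Z
    (0x0061, 0x007A),  # Basic Latin, a-z
    (0x00C0, 0x00FF),  # Latin-1 Supplement, letter part
    (0x0100, 0x017F),  # Latin Extended-A
    (0x0180, 0x024F),  # Latin Extended-B
]

# Multiplication and division signs sit inside U+00C0..U+00FF
# but are not letters, so they are excluded explicitly.
NON_LETTERS = {0x00D7, 0x00F7}


def is_char(ch: str) -> bool:
    """Return True if `ch` is a single character in the assumed Latin ranges."""
    if len(ch) != 1:
        return False
    cp = ord(ch)
    if cp in NON_LETTERS:
        return False
    return any(lo <= cp <= hi for lo, hi in LATIN_RANGES)
```

Using ranges means that a letter such as `ß' or `é' is covered without
having to be listed explicitly.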
> My conclusion is that this grammar is just an ad-hoc solution,
> unfortunately, not correctly representing the `wortliste'.
It is work in progress. It does represent wortliste as it is now, but
should indeed, as you pointed out earlier, be more generic.
Thanks for reviewing this version.
>
>> Does anyone know a script that can validate wortliste against this
>> definition?
> I don't know such a script.
>
>
> Werner