[Trennmuster] Railroad diagram first complete version
Stephan Hennig
mailing_list at arcor.de
Sun Nov 17 14:46:08 CET 2013
On 14.11.2013 21:17, Pander wrote:
> Please review the first complete railroad diagram of the hyphenation
> format found in the attachment. It should cover the complete file wortliste.
Looking at the files, I realize that there are language-specific rules
that I don't think should be covered in the EBNF grammar, but that would
be nice to have checked when validating a particular file containing
words in/for a particular language.
As an example, non-standard hyphenation in German is restricted to
{ck/k-k}
{bb/bb=b}
{ff/ff=f}
etc.
But I don't think the structure of these rules is general enough to be
fixed in the grammar. That is, MorphemeAltSpelling should simply be two
'Words' with optional leading or trailing 'Hyphens' (separated by a
slash and enclosed in a pair of braces).
BTW, the latter three characters (the slash and the two braces) could
become their own identifiers, too, so that changing, e.g., the separator
character doesn't require touching the MorphemeAltSpelling rule.
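
A minimal Python sketch of the split suggested above: a generic shape check
with the three delimiters as named constants, plus a separate per-language
whitelist. All names (OPEN, SEP, CLOSE, HYPHENS, WORD, GERMAN_ALLOWED, check)
are hypothetical and are not taken from the railroad diagram or from any
existing script. It assumes that only '-' and '=' act as hyphens, that a
'Word' is a run of letters, and that hyphens may also occur inside a side
(as in k-k).

```python
import re

# Delimiters as named constants, so changing one does not touch the pattern.
OPEN, SEP, CLOSE = "{", "/", "}"
HYPHENS = "-="        # assumption: only the hyphen characters seen in the examples
WORD = r"[^\W\d_]+"   # assumption: a 'Word' is a run of Unicode letters


def alt_spelling_pattern(open_=OPEN, sep=SEP, close=CLOSE):
    """Build the generic shape: two sides, separated by sep, enclosed in braces."""
    hyph = f"[{re.escape(HYPHENS)}]"
    # One side: letters with optional leading, inner and trailing hyphens.
    side = f"{hyph}?{WORD}(?:{hyph}{WORD})*{hyph}?"
    return re.compile(
        f"{re.escape(open_)}({side}){re.escape(sep)}({side}){re.escape(close)}"
    )


PATTERN = alt_spelling_pattern()

# Language-specific whitelist, kept outside the grammar. Only the three
# expressions quoted above are listed; the original list continues ("etc.").
GERMAN_ALLOWED = {"{ck/k-k}", "{bb/bb=b}", "{ff/ff=f}"}


def check(expr, allowed=None):
    """Return (well_formed, permitted_for_language) for one expression."""
    well_formed = PATTERN.fullmatch(expr) is not None
    permitted = well_formed and (allowed is None or expr in allowed)
    return well_formed, permitted
```

With this split, check("{xy/x-y}", GERMAN_ALLOWED) would pass the generic
shape test but fail the German whitelist, which is the kind of
language-specific validation meant above.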
Shouldn't MorphemeMultiHyphenation be referenced in the Morpheme rule?
BTW, I'd avoid the term Morpheme, since a series of characters isn't
necessarily a morpheme. In
A{ck/k-k}er
neither 'A', nor '{ck/k-k}', nor 'er' is a morpheme. In the script
skripte/parse_wortliste.lua I used the term 'cluster'. Along these lines:
A 'word' is a series of 'clusters' separated by 'hyphens'.
A 'cluster' is a series of 'fundamental clusters'.
A 'fundamental cluster' is
- a series of characters or
- a 'non-standard hyphenation' expression {} or
- an 'alternative expression' [].
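
The three definitions above can be sketched as a small splitter in Python.
This is only an illustration under assumptions, not a port of
skripte/parse_wortliste.lua: the names parse_word and matching_close are
hypothetical, only '-' and '=' are treated as hyphens, and the contents of
{} and [] expressions are kept as opaque strings rather than parsed further.

```python
HYPHENS = set("-=")                 # assumption: hyphen characters from the examples
BRACKETS = {"{": "}", "[": "]"}     # non-standard hyphenation / alternative expression


def matching_close(text, start):
    """Return the index of the bracket that closes the one at text[start]."""
    open_ch, close_ch = text[start], BRACKETS[text[start]]
    depth = 0
    for i in range(start, len(text)):
        if text[i] == open_ch:
            depth += 1
        elif text[i] == close_ch:
            depth -= 1
            if depth == 0:
                return i
    raise ValueError(f"unbalanced {open_ch!r} at position {start}")


def parse_word(word):
    """Split a word into clusters; each cluster is a list of fundamental clusters."""
    clusters, current, chars = [], [], []

    def flush_chars():
        # Close the pending run of plain characters, if there is one.
        if chars:
            current.append("".join(chars))
            chars.clear()

    i = 0
    while i < len(word):
        ch = word[i]
        if ch in BRACKETS:
            # A {} or [] expression is one fundamental cluster, kept verbatim.
            flush_chars()
            end = matching_close(word, i)
            current.append(word[i:end + 1])
            i = end + 1
        elif ch in HYPHENS:
            # A top-level hyphen ends the current cluster.
            flush_chars()
            clusters.append(current)
            current = []
            i += 1
        else:
            chars.append(ch)
            i += 1
    flush_chars()
    clusters.append(current)
    return clusters
```

For the example above, parse_word("A{ck/k-k}er") yields a single cluster with
the three fundamental clusters 'A', '{ck/k-k}' and 'er'; the hyphen inside the
braces does not split the word because it is not at the top level.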
> Does anyone know a script that can validate wortliste against this
> definition?
Don't know.
Best regards,
Stephan Hennig