[Trennmuster] Comments on Sander's IETF proposal
Werner LEMBERG
wl at gnu.org
Mon Jan 13 09:02:58 CET 2014
Here are my comments.
Werner
======================================================================
> Please provide first feedback (content, spelling, grammar, examples,
> examples in French, comments on examples, references,
> acknowledgements) as soon as you can.
Here are my comments, referring to the superscript numbers.
3 s/upper case/uppercase/
4 Why are you explicitly referencing Unicode standard 6.1.0? The
current version is 6.3, and 7.0 will appear soon.
4 I suggest that you replace
A Unicode code point can be recognised by a capital U, followed
by a plus sign and followed by a hexadecimal number from one to
five positions. Usually, two or four positions are being used.
with
A Unicode code point can be recognised by a capital U, followed
by a plus sign and followed by four to six hexadecimal digits.
Usually, four or five digits are being used.
U+XXXX and U+XXXXX are the standard forms. U+XXXXXX also exists,
since the highest possible Unicode value is U+10FFFF.
I've never seen U+XX before – or maybe this is IETF speak? It's
definitely *not* Unicode speak, cf. Appendix A, `Notational
Conventions', in the Unicode book. I further suggest that you
extend U+XX to U+00XX in the whole document.
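To illustrate the notation suggested above, here is a minimal sketch in Python. The function name `format_code_point` is made up for this example and does not come from the draft; it simply pads the hexadecimal value to at least four digits, so that U+XX can never occur.

```python
def format_code_point(cp: int) -> str:
    """Return the U+ notation for a code point, using four to six hex digits."""
    # Valid Unicode code points range from U+0000 to U+10FFFF.
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {cp!r}")
    # '04X' pads with leading zeros to four digits; larger values
    # simply use five or six digits.
    return f"U+{cp:04X}"


if __name__ == "__main__":
    for cp in (0x2D, 0xAD, 0x2010, 0x1F600, 0x10FFFF):
        print(format_code_point(cp))
```

With this, 0x2D is written as U+002D rather than U+2D, while values above U+FFFF keep their natural five or six digits.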
6 s/choosing a reserved characters/choosing reserved characters/
s/normally considered/normally considered as/
s/aims to offers/aims to offer/
7 s/sentemce/sentence/
15 s/library with libhyphen/with libhyphen/
41 The next paragraph after this one is missing a starting
superscript ID...
43 s/White space/Whitespace/ (in the whole document) – this is a
technical term and not written as two words, AFAIK.
54 You should completely skip that paragraph. It's an unnecessary
restriction, contains a lot of hearsay, and is influenced by
Western hyphenation tradition. As a hypothetical example, imagine
large, square Maya glyphs, where hyphenation is indicated by a
small dot somewhere.
It's even wrong for Western typography, since it is possible to
slightly shift the hyphenation character to the right, outside of
the justified text block (similar to other punctuation
characters). Cf. the article `Hanging Punctuation',
http://www.ntg.nl/maps/25/12.pdf.
55 Should be removed, for the same reasons as 54.
56 Should be removed, for the same reasons as 54.
57 The last character in EBNF form is wrong: s/x10FFF/x10FFFF/. Note
that in the range U+10000-U+10FFFF there are more noncharacters,
e.g., U+1FFFE. See section 16.7, `Noncharacters', in the Unicode
standard.
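The noncharacters mentioned here follow a simple pattern, which the following Python sketch tries to capture. The function name `is_noncharacter` is hypothetical and not part of the draft; the rule it encodes is that the noncharacters are U+FDD0..U+FDEF plus the last two code points of each of the 17 planes.

```python
def is_noncharacter(cp: int) -> bool:
    """Return True if cp is one of the Unicode noncharacters."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {cp!r}")
    # Contiguous block of 32 noncharacters in the BMP.
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # Last two code points of every plane: U+nFFFE and U+nFFFF.
    return (cp & 0xFFFE) == 0xFFFE


if __name__ == "__main__":
    total = sum(is_noncharacter(cp) for cp in range(0x110000))
    print(total)
```

An EBNF range that is meant to exclude noncharacters would have to leave out all of these values, not only U+FFFE and U+FFFF.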
58 s/but when non is proved/but when none is provided/
63 s/uncommen/uncommon/
Why is the Greek example commented out and split into two lines?
64 s/becasue/because/
The English hyphenation examples are US English ones, AFAICS, and
there are *huge* differences from British English hyphenation. You
should explicitly state that.
67 The form «foo» is Swiss. In Germany and Austria, it's rather
»foo«. I would use a different representation for English
translations.
68 s/These are '«' and '»'.These help understanding/
These are '«' and '»'. These help understand/
70 If you explicitly mention German, you should give an example (or a
non-example). Otherwise I would skip that remark.
71 s/left to right and gaining priority/
left to right, gaining priority/
s/has greater priority over/has greater priority than/
72 I suggest that you put the long non-English words into displays to
avoid accidental, incorrect hyphenation in the reader's browser.
Additionally, non-Germanic languages like Hungarian also allow
such concatenations. Here's a possible reformulation.
Many languages can concatenate words to form long compounds;
real-life examples from Western languages are
Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz
(German) and
aansprakelijkheidswaardevaststellingsveranderingen
(Dutch), respectively. [...]
If you look at the Sanskrit example from
https://en.wikipedia.org/wiki/Longest_words (shown with artificial
hyphens!), the German word is dwarfed, BTW.
80 Some text is missing here :-)
84 In the example, you are using `.', which is not yet defined.
90 Maybe you should mention that `ck -> k-k' is a rule from the old
German orthography.
91 Ditto.
94 s/counter example/counterexample/
95 The classical English example is the word `record'. Knuth shows
the following in the TeXbook, Appendix H, `Hyphenation':
The committee skeptically re-
commended more study for a bill
to require warning labels on rec-
ords with subliminal messages re-
corded backward.
— THE PENINSULA TIMES TRIBUNE (April 28, 1982)
97 s/The use of a nesting a changing cluster/
The use of nested changing clusters/
In the English example, you should use `wales' for the
non-topographic word, not `Wales'. Additionally, this is not a
genitive but a plural of the word `wale'.
This leads to an interesting question, namely how uppercase and
lowercase influence hyphenation, and how uppercase and lowercase
should be represented in the word list. This is not discussed at
all.
102 s/should be filtered out of fixed/should be filtered out/
[TRENNMP] s/Trennmuster Projekt/Trennmuster/
[APPENDIX A] s/hyphenation definitions format/
format of hyphenation definitions/