[Trennmuster] Comments on Sander's IETF proposal

Werner LEMBERG wl at gnu.org
Mon Jan 13 09:02:58 CET 2014



Here are my comments.


    Werner


======================================================================


> Please provide first feedback (content, spelling, grammar, examples,
> examples in French, comments on examples, references,
> acknowledgements) as soon as you can.

Here are my comments, referring to the superscript numbers.

  3 s/upper case/uppercase/

  4 Why are you explicitly referencing Unicode standard 6.1.0?  The
    current version is 6.3, and 7.0 will appear soon.

  4 I suggest that you replace

      A Unicode code point can be recognised by a capital U, followed
      by a plus sign and followed by a hexadecimal number from one to
      five positions.  Usually, two or four positions are being used.

    with

      A Unicode code point can be recognised by a capital U, followed
      by a plus sign and followed by four to six hexadecimal digits.
      Usually, four or five digits are being used.

    U+XXXX and U+XXXXX are the standard forms.  U+XXXXXX also exists,
    since the highest possible Unicode value is U+10FFFF.

    I've never seen U+XX before – or maybe this is IETF speak?  It's
    definitely *not* Unicode speak, cf. Appendix A, `Notational
    Conventions', in the Unicode book.  I further suggest that you
    extend U+XX to U+00XX in the whole document.
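
    As an illustration of the notation described above, here is a
    minimal sketch that prints a code point in the U+XXXX form, padded
    to at least four hexadecimal digits.  The function name
    `format_code_point' is only an assumption for this example; it is
    not taken from the draft.

```python
def format_code_point(cp: int) -> str:
    """Return the U+ notation for a code point (hypothetical helper).

    The value is zero-padded to at least four hexadecimal digits, so
    0xAD becomes U+00AD; larger values use five or six digits.
    """
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {cp:#x}")
    return f"U+{cp:04X}"


if __name__ == "__main__":
    for value in (0xAD, 0x2010, 0x1F600, 0x10FFFF):
        print(format_code_point(value))
```

    With this padding, a two-digit form like U+XX never appears in the
    output.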

  6 s/choosing a reserved characters/choosing reserved characters/
    s/normally considered/normally considered as/
    s/aims to offers/aims to offer/

  7 s/sentemce/sentence/

 15 s/library with libhyphen/with libhyphen/

 41 The next paragraph after this one is missing a starting
    superscript ID...

 43 s/White space/Whitespace/ (in the whole document) – this is a
    technical term and not written as two words, AFAIK.

 54 You should completely skip that paragraph.  It's an unnecessary
    restriction, contains a lot of hearsay, and is influenced by
    Western hyphenation tradition.  As a hypothetical example, imagine
    large, square Maya glyphs, where hyphenation is indicated by a
    small dot somewhere.

    It's even wrong for Western typography, since it is possible to
    slightly shift the hyphenation character to the right, outside of
    the justified text block (similar to other punctuation
    characters).  Cf. the article `Hanging Punctuation',
    http://www.ntg.nl/maps/25/12.pdf.

 55 Should be removed, for the same reasons as 54.

 56 Should be removed, for the same reasons as 54.

 57 The last character in EBNF form is wrong: s/x10FFF/x10FFFF/.  Note
    that in the range U+10000-U+10FFFF there are more noncharacters,
    e.g. U+1FFFE.  See section 16.7, `Noncharacters', in the Unicode
    standard.
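
    To make the noncharacter remark concrete, here is a small sketch of
    a test that covers all planes.  It assumes the usual Unicode
    definition of noncharacters (U+FDD0..U+FDEF plus the last two code
    points of every plane); the function name `is_noncharacter' is
    hypothetical and not part of the draft's EBNF.

```python
def is_noncharacter(cp: int) -> bool:
    """Return True if cp is one of the Unicode noncharacters (sketch)."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {cp:#x}")
    # Contiguous block U+FDD0..U+FDEF in the BMP.
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # Last two code points of each plane: U+nFFFE and U+nFFFF.
    return (cp & 0xFFFE) == 0xFFFE


if __name__ == "__main__":
    for value in (0xFFFE, 0x1FFFE, 0x10FFFF, 0x10000):
        print(f"U+{value:04X}", is_noncharacter(value))
```

    An EBNF range that simply ends at x10FFFF therefore still includes
    noncharacters such as U+1FFFE unless they are excluded separately.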

 58 s/but when non is proved/but when none is provided/

 63 s/uncommen/uncommon/

    Why is the Greek example commented out and split into two lines?

 64 s/becasue/because/

    The English hyphenation examples are US English ones, AFAICS, and
    there are *huge* differences from British English hyphenation.  You
    should explicitly state that.

 67 The form «foo» is Swiss.  In Germany and Austria, it's rather
    »foo«.  I would use a different representation for English
    translations.

 68 s/These are '«' and '»'.These help understanding/
      These are '«' and '»'. These help understand/

 70 If you explicitly mention German, you should give an example (or a
    non-example).  Otherwise I would skip that remark.

 71 s/left to right and gaining priority/
      left to right, gaining priority/

    s/has greater priority over/has greater priority than/

 72 I suggest that you put the long non-English words into displays to
    avoid accidental, incorrect hyphenation in the reader's browser.
    Additionally, non-Germanic languages like Hungarian also allow
    such concatenations.  Here's a possible reformulation.

      Many languages can concatenate words to form long compounds;
      real-life examples from Western languages are

        Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz

      (German) and

        aansprakelijkheidswaardevaststellingsveranderingen

      (Dutch), respectively.  [...]

    If you look at the Sanskrit example from
    https://en.wikipedia.org/wiki/Longest_words (shown with artificial
    hyphens!) the German word is a dwarf by comparison, BTW.

 80 Some text is missing here :-)

 84 In the example, you are using `.', which is not yet defined.

 90 Maybe you should mention that `ck -> k-k' is a rule from the old
    German orthography.

 91 Ditto.
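
    The `ck -> k-k' rule from comment 90 can be illustrated with a toy
    sketch.  It assumes the well-known behaviour that the old
    orthography breaks `Zucker' as `Zuk-ker', while the reformed
    orthography keeps `ck' together (`Zu-cker').  The helper
    `break_at_ck' is hypothetical; it handles only the first word-internal
    `ck' and is not a real hyphenation algorithm.

```python
def break_at_ck(word: str, traditional: bool) -> str:
    """Insert a hyphen at the first word-internal 'ck' (toy example).

    traditional=True  applies the old rule  ck -> k-k  (Zuk-ker).
    traditional=False keeps 'ck' together           (Zu-cker).
    """
    i = word.find("ck")
    # No break without 'ck', or when 'ck' starts or ends the word.
    if i <= 0 or i + 2 >= len(word):
        return word
    head, tail = word[:i], word[i + 2:]
    if traditional:
        return f"{head}k-k{tail}"
    return f"{head}-ck{tail}"


if __name__ == "__main__":
    for w in ("Zucker", "backen", "Sack"):
        print(w, break_at_ck(w, traditional=True), break_at_ck(w, traditional=False))
```

    The point is that the old rule changes the spelling at the break,
    which is why such examples need to be marked as belonging to the
    old orthography.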

 94 s/counter example/counterexample/

 95 The classical English example is the word `record'.  Knuth shows
    the following in the TeXbook, Appendix H, `Hyphenation':

                                The committee skeptically re-
                              commended more study for a bill
                            to require warning labels on rec-
                            ords with subliminal messages re-
                                             corded backward.

               — THE PENINSULA TIMES TRIBUNE (April 28, 1982)

 97 s/The use of a nesting a changing cluster/
      The use of nested changing clusters/

    In the English example, you should use `wales' for the
    non-topographic word, not `Wales'.  Additionally, this is not a
    genitive but a plural of the word `wale'.

    This leads to an interesting question, namely how uppercase and
    lowercase influence hyphenation, and how uppercase and lowercase
    should be represented in the word list.  This is not discussed at
    all.

102 s/should be filtered out of fixed/should be filtered out/

[TRENNMP]  s/Trennmuster Projekt/Trennmuster/

[APPENDIX A] s/hyphenation definitions format/
               format of hyphenation definitions/

