[Trennmuster] pattern and character analysis

Pander pander at users.sourceforge.net
Tue Nov 5 22:11:19 CET 2013


Hi all,

I would like to contribute the following analysis to this project. It is
an analysis of the characters and reserved characters used in the German
hyphenation pattern definitions.

With it I have already been able to get a small typo in one of the
patterns fixed. It is a convenient tool for spotting errors and getting
statistics on the patterns. As a bonus, you can use
histogram-wortzeichen.png to win at Galgenmännchen (hangman).
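
For those who do not want to unpack the attachment, here is a minimal
sketch of the kind of counting involved. It is not the attached script.
It assumes that digits and the dot are the reserved characters of the
pattern syntax, and the names character_histogram, unexpected_characters
and the sample patterns are made up for illustration only.

```python
from collections import Counter

# Assumption: digits and the dot are the "reserved" characters of
# TeX-style hyphenation patterns; everything else counts as a letter.
RESERVED = set("0123456789.")


def character_histogram(patterns):
    """Count letters and reserved characters separately."""
    letters = Counter()
    reserved = Counter()
    for pattern in patterns:
        for ch in pattern.strip():
            if ch in RESERVED:
                reserved[ch] += 1
            else:
                letters[ch.lower()] += 1
    return letters, reserved


def unexpected_characters(letters, allowed):
    """Return the counted characters that lie outside the expected alphabet."""
    return sorted(ch for ch in letters if ch not in allowed)


# Made-up sample patterns, not taken from the project files.
sample = [".ab1c", "a2b", "ä3u", "x1y!"]
letters, reserved = character_histogram(sample)
allowed = set("abcdefghijklmnopqrstuvwxyzäöüß")

print(letters.most_common())
print(unexpected_characters(letters, allowed))
```

A character that shows up outside the expected alphabet, like the "!" in
the last sample pattern, is the sort of typo such a histogram makes easy
to spot.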

You could even use these results to improve
https://de.wikipedia.org/wiki/Buchstabenh%C3%A4ufigkeit. A similar
analysis of character frequency in Dutch, using the same gnuplot
scripts, was published at https://nl.wikipedia.org/wiki/Letterfrequentie
and http://opentaal.org/het-laatste-nieuws/171-karakterfrequentie

Werner has already reviewed my contribution and sent me corrections,
which I have applied. Could one of you review the attached file as well
and, if everything is OK, add it to the Git repo in
wortliste/skripte/python/, please?

Best regards,

Pander
-------------- next part --------------
A binary attachment was scrubbed...
File name   : histogramm.tar.bz2
File type   : application/x-bzip
File size   : 101435 bytes
Description : not available
URL         : <https://listi.jpberlin.de/pipermail/trennmuster/attachments/20131105/1e5666f3/attachment.bz2>

