
228 commits

Author SHA1 Message Date
acli
09ab9d4769 Interim update 2004-02-23 06:05:56 +00:00
acli
b6f552e6e2 Interim update 2004-02-23 05:56:19 +00:00
acli
d03a71a5e2 More bugs that prevented some strings from being translated properly.
This time it's a trimming bug.
2004-02-23 05:51:30 +00:00
acli
dda8e7d233 Off-by-one bug 2004-02-23 04:36:56 +00:00
acli
ae4bf41171 Minor dialect correction 2004-02-23 04:29:28 +00:00
acli
dae8ab184b Bug that prevented msgid's with French characters from being translated
should now be really fixed.
2004-02-23 04:26:04 +00:00
acli
56d4a4d0ba Charset "translation" line 2004-02-23 04:19:24 +00:00
acli
ec6562b7d3 Oops, forgot to take out some debugging print statements 2004-02-23 04:02:06 +00:00
acli
3e140b7053 More interim updates 2004-02-23 04:00:38 +00:00
acli
422739c80d Interim update 2004-02-23 03:15:01 +00:00
acli
77a1d8682d Fold all consecutive whitespace into single blanks. This avoids problems
when minor whitespace changes occur in the original templates; it also
makes the strings much easier to read (e.g., instead of "foo\n\n\t\t  bar",
xgettext.pl will now always generate "foo bar" and tmpl_process3.pl will
understand it to be the same as the original string).
2004-02-23 01:21:03 +00:00
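The folding described in this commit can be illustrated with a minimal sketch. It is written in Python for illustration only; the actual scripts (xgettext.pl and tmpl_process3.pl) are Perl, and the function name `fold_whitespace` is an assumption, not a name taken from them.

```python
import re

def fold_whitespace(s: str) -> str:
    """Collapse every run of whitespace (spaces, tabs, newlines) into one blank.

    Hypothetical helper: it only illustrates the normalization the commit
    message describes, not the code the scripts actually use.
    """
    return re.sub(r"\s+", " ", s)

# The commit message's own example: both spellings normalize to one string,
# so the extractor and the installer can agree on the same msgid.
print(fold_whitespace("foo\n\n\t\t  bar"))  # foo bar
```

Because both the extraction step and the install step would apply the same normalization, a template that is merely re-indented still maps to the same string.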
acli
10a00d1b50 Preliminary support for "analysis" of strings with <a> tags.
Early termination of analysis if we encounter some strings, such as </h1>
or | or ||, in order to avoid extracting strings that are unnecessarily
long and which don't add any meaningful context.
2004-02-22 21:34:40 +00:00
acli
03695ce811 Try to relax the criteria for allowing groups of tokens without TMPL_VAR
to be combined together into one string. This seems to have the desired
effect (that "<b>foo</b> bar" type strings are now recognized in one piece).

However, "<h1>foo</h1>\nexplanation"-type things may now also be (arguably
wrongly) recognized as one piece.
2004-02-22 09:04:53 +00:00
acli
9268d4e11c The French character handling fix for tmpl_process3 was not checked in
for some reason.

Try to remove trailing ( in strings too.
2004-02-22 08:18:27 +00:00
acli
b7150bb0c3 Ugly hack to get rid of the close tag in pathetic "foo %s</h1>"-like strings 2004-02-22 07:00:16 +00:00
acli
fb1cfd3dd3 Templates with French characters were not handled properly in the install
step. This is now fixed.
2004-02-22 06:46:15 +00:00
acli
b2138f5d0d Handle the iso8859-1 charset somewhat, so that when the po file is in
either iso8859-1 or utf8, msgmerge(1) won't crap out. The code is ugly;
the conversion table is hard-coded, and in some places not very appropriate.

However, this does fix the case where a few strings containing French
characters can't be translated. As a side effect, tmpl_process3 can now
also be used for French or other languages using iso8859-1.
2004-02-22 05:18:52 +00:00
acli
5cc08f652b Updates 2004-02-20 09:32:14 +00:00
acli
0f1c4df62a Fixed bug where a <textarea...>#cdata</textarea> on one line won't be
scanned properly.
2004-02-20 07:52:32 +00:00
acli
12ce5c292f Minor updates 2004-02-20 07:25:38 +00:00
acli
3101a3b414 Minor update (after changing TmplTokenizer.pm) 2004-02-20 07:13:21 +00:00
acli
257b26d141 Partially allow combination of several TEXT tokens. It seems that this
gives better strings. (Always allowing combinations wreaks havoc; we
currently avoid this by allowing combination only if the first and last
tokens are both TEXT.)
2004-02-20 07:09:47 +00:00
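The rule in this commit (combine a group of tokens only if the first and last are both TEXT) might look roughly like the sketch below. This is Python for illustration only; the `Token` type, the kind names other than TEXT, and the function names are all assumptions, not the tokenizer's real interface.

```python
from typing import List, NamedTuple

class Token(NamedTuple):
    kind: str  # hypothetical kinds: "TEXT" for plain text, "TAG" for markup
    text: str

def can_combine(tokens: List[Token]) -> bool:
    """Allow combination only when the group starts and ends with TEXT."""
    return bool(tokens) and tokens[0].kind == "TEXT" and tokens[-1].kind == "TEXT"

def combine(tokens: List[Token]) -> str:
    """Join the group into one translatable string."""
    return "".join(t.text for t in tokens)

group = [
    Token("TEXT", "Click "),
    Token("TAG", "<b>"),
    Token("TEXT", "here"),
    Token("TAG", "</b>"),
    Token("TEXT", " to continue"),
]
if can_combine(group):
    print(combine(group))
```

Under this rule a group such as `<b>foo</b>` on its own is not combined, because it starts and ends with a tag rather than with text.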
acli
feb6e56449 Updates 2004-02-20 07:04:10 +00:00
acli
96534eac9a Preliminary checkin 2004-02-20 04:38:36 +00:00
acli
b6c37e376e Support %0.0s notation so that we can omit the %s as in Year%s for the
Chinese translation. (This won't work for all languages; ultimately the
English templates must be fixed.)
2004-02-20 04:38:02 +00:00
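The `%0.0s` trick relies on printf-style precision: a precision of zero truncates the string argument to nothing, so the parameter is consumed but contributes no output. The sketch below shows this with Python's `%` operator purely as an illustration; the real scripts are Perl and may perform the substitution differently, and the argument value and the translated string are made-up examples.

```python
# Hypothetical value that the template would substitute for %s.
suffix = "s"

english = "Year%s" % suffix      # the parameter appears in the output
chinese = "年%0.0s" % suffix      # precision 0 drops the parameter entirely

print(english)
print(chinese)
```

The translated string still contains one conversion per argument, so the argument count matches, yet nothing from the argument reaches the output.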
acli
0d4f569ff3 Try not to display warnings of 40 lines or so too often 2004-02-20 02:48:39 +00:00
acli
793f49ec7f Escape ISO8859-1 characters. msgmerge still hates these strings, but at
least the po file merges.
2004-02-20 00:39:26 +00:00
acli
14a62cc0c4 Forgot to check for fuzzy-ness. 2004-02-19 21:28:14 +00:00
acli
8b57901d85 New scripts for translation into Chinese and other languages where English
word order is too different from the word order of the target language to
yield meaningful translations.

The new scripts use a different translation file format (namely standard
gettext-style PO files).

This seems to work reasonably well (e.g., producing an empty en_GB translation
then installing seems to not corrupt the "translated" files), but it likely
will still contain some bugs. There is also little documentation, but try
to run perldoc on the .p[lm] files to see what's there. There are also some
spurious warnings (both from bugs in the new scripts and from buggy third-
party Locale::PO module).
2004-02-19 21:24:30 +00:00
acli
053bb685ab Warn against Apache #include directive 2004-02-18 06:56:19 +00:00
acli
7be0c493d9 Updated w.r.t. the text-extract2.pl filter. 2004-02-18 06:39:34 +00:00
acli
6e1a824374 The previous change was wrong. 2004-02-17 07:45:17 +00:00
acli
a9edbfe34c Allow trim to return the trimmed whitespace if the caller wants them. 2004-02-17 07:26:29 +00:00
acli
b318d2b8e3 Don't extract strings from the VALUE attributes of RADIO type INPUT fields;
these aren't translatable.
2004-02-17 06:30:38 +00:00
acli
4d2463c34a Insert the filename of the token into the TmplToken object too 2004-02-17 05:42:27 +00:00
acli
39dc31c2c9 Converted TmplTokenizer into a class. Everything still seems ok, but it is
not tested thoroughly.
2004-02-17 05:07:04 +00:00
acli
ae87eee049 Still more bugfixes for my own bugs.
$readahead is now an array @readahead which can contain TmplToken objects,
so "ungetting" tokens should not disturb the line number counter any more.
2004-02-17 03:17:48 +00:00
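One way to see why a readahead list of token objects keeps the line counter stable: if each token records its own line number when it is first read, pushing it back and reading it again never touches the counter. The sketch below is Python for illustration only; the class names, the method names, and the one-token-per-line simplification are all assumptions, not the module's real design.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional

@dataclass
class Token:
    text: str
    line: int  # line number recorded when the token was first read

class Tokenizer:
    """Toy tokenizer: treats each input line as one token."""

    def __init__(self, lines: Iterable[str]) -> None:
        self._source: Iterator[str] = iter(lines)
        self._readahead: List[Token] = []  # stack of "ungotten" tokens
        self.line_no = 0                   # advances only on fresh reads

    def next_token(self) -> Optional[Token]:
        if self._readahead:
            # Re-deliver a pushed-back token; the counter is left alone.
            return self._readahead.pop()
        raw = next(self._source, None)
        if raw is None:
            return None
        self.line_no += 1
        return Token(raw.rstrip("\n"), self.line_no)

    def unget(self, token: Token) -> None:
        self._readahead.append(token)

tok = Tokenizer(["<h1>Title</h1>\n", "Some text\n"])
first = tok.next_token()
tok.unget(first)           # push back: line_no stays at 1
again = tok.next_token()   # same token, same recorded line
```

With a single scalar holding raw text, pushing input back would require re-counting newlines; storing the line number on the token avoids that bookkeeping.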
acli
c1e51c54d5 Fixed more bugs during the modularization 2004-02-17 03:02:39 +00:00
acli
09c348bd9c Further breaking up of the TmplTokenizer module.
A couple of minor fixes.
2004-02-17 02:45:27 +00:00
acli
2f7192689a Avoid directly accessing variables inside the module 2004-02-16 23:50:56 +00:00
acli
0b6030aecd Some functions should not be in the module; these are now removed. 2004-02-16 23:46:34 +00:00
acli
59d2e35180 Pulled the tokenizer out into a module. Hope this has been done right. 2004-02-16 23:42:57 +00:00
acli
de8d0930ee Minor factoring of construction of warning messages. 2004-02-16 22:50:34 +00:00
acli
2a9be2b2e6 Don't bother warning about TMPL_VAR if the key is onclick, onblur, etc.
We don't know how to warn/what to suggest, & that will only confuse people
2004-02-14 09:50:11 +00:00
acli
1d45c47c02 Fix spurious warnings if attribute is in the form foo="bar"</TMPL_IF> 2004-02-14 09:41:28 +00:00
acli
f7b649f41b Make a reasonable suggestion for ESCAPE= if we warn about lack of it 2004-02-14 09:33:09 +00:00
acli
3fd0a52e0a Fixed spurious warning about unescaped < inside cdata 2004-02-14 09:23:34 +00:00
acli
050e1995d9 Minor change to make the "closed start tag" warning more understandable 2004-02-14 09:10:20 +00:00
acli
ce2189ef37 Don't complain about strange attribute syntax if what we see is a
reasonable templating control flow directive (if, else, unless).
2004-02-14 08:49:21 +00:00
acli
524a76f1b3 Have to make it know what "closed start tag" notation is; otherwise it
spews out more than a screenful of text for an "unknown token" when such
notation is seen
2004-02-14 08:03:02 +00:00