Re: [frogs] [patch] convert-ly cannot deal with accented characters



On Thu, Mar 4, 2010 at 5:41 PM, Graham Percival
<graham@xxxxxxxxxxxxxxxxx> wrote:
> I expect my 1st year students to go "duh, it doesn't work".  You can
> do better than them.

Come on, I'm but a pianist :)

The problem comes from

-    f.write (s.encode (f.encoding or 'utf_8'))

since a) the f object *doesn't have* an "encoding" property, and
  b) the s object cannot be encoded without having been decoded first.


Hence, we need to decode it so we can re-encode it. Here comes another
problem: decode it from *what* encoding exactly? By default, Python
tries to read it as an ASCII string, which fails as soon as it contains
accented characters.
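
To illustrate the failure, here is a minimal sketch. The sample string and
the try_decode helper are made up for the example; they are not part of
convert-ly.

```python
# Bytes as they might be read from a .ly file saved as UTF-8
# (illustrative data, not taken from convert-ly).
raw = u"Pr\u00e9lude".encode("utf-8")


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes do not fit the encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# The accented character is stored as bytes above 0x7F, so ASCII rejects it.
ascii_result = try_decode(raw, "ascii")
# Decoding with the encoding the bytes were actually written in works.
utf8_result = try_decode(raw, "utf-8")
```

Here ascii_result comes back as None while utf8_result holds the original
text, which is why the source encoding has to be named explicitly.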

On Windows, that would typically be Cp1252. On Mac OS X, that would be
UTF-8, and on GNU/Linux it can be anything from ISO8859-n to UTF-8.
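
A small sketch of why the platform matters: the same character is stored
as different bytes under each encoding, and guessing wrong either fails or
silently produces garbage. The variable names are only for illustration.

```python
text = u"\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# One character, two different byte sequences.
as_cp1252 = text.encode("cp1252")   # a single byte
as_utf8 = text.encode("utf-8")      # two bytes

# Reading UTF-8 bytes as Cp1252 does not raise; it yields two wrong characters.
misread = as_utf8.decode("cp1252")

# Reading Cp1252 bytes as UTF-8 raises instead.
try:
    as_cp1252.decode("utf-8")
    strict_failure = False
except UnicodeDecodeError:
    strict_failure = True
```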

Therefore, I'm using the locale settings to detect the encoding (they
are available on all operating systems). locale.getdefaultlocale()
returns a pair such as ('en_US', 'UTF-8'). That requires me to import
the locale module.

I need the second part of that pair (hence the [1]).

+    f.write (s
+      .decode (sys.stderr.encoding or locale.getdefaultlocale()[1])
+      .encode (f.encoding or 'utf_8'))
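
For reference, the decode-then-encode idea can be sketched on its own as
below. This is not the patch itself: write_recoded and
guess_source_encoding are hypothetical helpers, the sketch assumes a
byte-oriented file object, and it swaps in locale.getpreferredencoding()
for getdefaultlocale()[1]. It also uses getattr so that a file object
without an "encoding" attribute falls back to UTF-8.

```python
import io
import locale


def guess_source_encoding():
    """Best-effort guess at the encoding of the input bytes.

    Assumption: the input was written in the user's preferred locale
    encoding. Falls back to UTF-8 if the locale reports nothing.
    """
    return locale.getpreferredencoding(False) or "utf_8"


def write_recoded(f, data, source_encoding=None):
    """Decode `data` (bytes) and write it to `f` re-encoded for the target."""
    source = source_encoding or guess_source_encoding()
    # Not every file-like object has an `encoding` attribute, and it may be None.
    target = getattr(f, "encoding", None) or "utf_8"
    f.write(data.decode(source).encode(target))


# Usage: Cp1252 input, written to an in-memory byte buffer as UTF-8.
buf = io.BytesIO()
write_recoded(buf, u"caf\u00e9".encode("cp1252"), source_encoding="cp1252")
result = buf.getvalue()
```

Passing the source encoding explicitly, as in the usage lines, avoids
depending on whatever the locale happens to report on a given machine.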

Last time I checked, it worked. Turns out it doesn't. Ergo: I'm gonna
give up and play some Chopin instead.

Cheers,
Valentin
