Hi All,
Chris Mellon wrote:
>> Christopher Barker wrote:
>>
>> To make that clear with your example:
>>
>> 2. "Right" way:
>> msg = u"Could be a fatal string"
>> wx.MessageBox( msg, "", wx.OK )
>>
>> so pass only unicode objects to wxPython.
>>
>> Where did your "could be a fatal string" come from? When you get it is
>> when you should handle the decoding, as Chris (the other one) said.
>>
>> ----
>>
>> Basically you are right. But you forget, in my mind, one important effect
>> which is coming from the Python side. Python "speaks" better iso-8859-1 than
>> cp1252 (win platform). This is especially true for the str <--> unicode
>> conversions.
>>
>
> Sorry, but this just isn't true. Python "speaks" both of them perfectly well.
>
>> If one works in a pure <str>-type mode, eg cp1252, on a win platform, using the
>> wxPython ansi build, then there is a proper cp1252-ANSI mapping and it avoids
>> some annoying side effects.
>>
>
> "str" is a sequence of bytes. The default conversion to use when
> converting to unicode is ascii,

Not always. Python's default can be changed from the site.py file. And
for the automatic conversions done in wxPython (passing a string to a
wxString parameter in a Unicode build, or passing a Unicode object in an
ANSI build), if sys.getdefaultencoding() is still "ascii" then
wxPython will use the locale's encoding for the conversions
(locale.getpreferredencoding() where available, otherwise
locale.getdefaultlocale()[1]). Doing it this way means that when the
programmer needs to deal with strings that are not strictly ASCII, most
of the time wxPython will Do The Right Thing with the conversion because
it will use the current system locale's default encoding.

For the curious, here is the actual code for deciding what encoding to use:

    default = _sys.getdefaultencoding()
    if default == 'ascii':
        import locale
        import codecs
        try:
            if hasattr(locale, 'getpreferredencoding'):
                default = locale.getpreferredencoding()
            else:
                default = locale.getdefaultlocale()[1]
            codecs.lookup(default)
        except (ValueError, LookupError, TypeError):
            default = _sys.getdefaultencoding()
        del locale
        del codecs
    if default:
        wx.SetDefaultPyEncoding(default)
    del default

You can find out what conversion encoding wxPython is using with
wx.GetDefaultPyEncoding, and you can change it if you want with
wx.SetDefaultPyEncoding.

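
To see which encoding that fallback would land on for a given machine, the same decision can be tried without wx installed. This is only a sketch: choose_conversion_encoding is a hypothetical helper, not part of wxPython, and it takes the starting default as a parameter so that the "ascii" case can be exercised even on interpreters whose sys.getdefaultencoding() is not "ascii".

```python
import codecs
import locale


def choose_conversion_encoding(default):
    """Hypothetical helper: keep `default` unless it is 'ascii', in which
    case try the locale's preferred encoding instead."""
    if default != "ascii":
        return default
    try:
        candidate = locale.getpreferredencoding()
        # codecs.lookup raises LookupError if Python has no such codec.
        codecs.lookup(candidate)
        return candidate
    except (ValueError, LookupError, TypeError):
        # Same fallback as the snippet above: stay with the original default.
        return default


if __name__ == "__main__":
    import sys
    print(choose_conversion_encoding(sys.getdefaultencoding()))
    print(choose_conversion_encoding("ascii"))
```

In a classic wxPython session the value actually in use is whatever wx.GetDefaultPyEncoding() reports; the helper above only mirrors the selection logic.
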
Now that we are close to abandoning ANSI builds (as far as I understand,
which makes me less than happy anyway), there are a couple of things
that astonish me a bit, while being ironically sad (or sadly ironic?):
- It is very difficult (impossible?) to set up an encoding which will
support *all* the possible characters in the known world languages. I
used utf-8 for GUI2Exe but I remember reading that it could still fail
on some occasions (but my memory could fail here);
- It should be enough to put something like:
# -*- coding: utf-8 -*-
at the beginning of a script to force
Python/wxPython/numpy/matplotlib/whatever site-package you want to
transparently encode/decode everything without the developer's
intervention. If I wish to distribute my application in China, Russia
or Germany, my opinion is that I should not have to spend more than the
blink of an eye thinking about encodings.
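
A small sketch of where the failures tend to come from, offered as an illustration rather than as anyone's actual code from this thread. It uses Python 3 spelling (bytes/str instead of str/unicode), and decode_at_boundary is a hypothetical helper name. The same byte means different things under cp1252 and iso-8859-1, and it is not valid UTF-8 at all, so the trouble is usually bytes decoded with the wrong codec, not characters that UTF-8 cannot represent. The coding line at the top of a script only tells Python how to read the literals in that source file; it does not decode data arriving at run time.

```python
# The byte 0x80 as it might arrive from a file or database on Windows.
EURO_CP1252 = b"\x80"


def decode_at_boundary(raw, encoding):
    """Hypothetical helper: decode bytes once, where they enter the
    program; text that is already decoded passes through unchanged."""
    if isinstance(raw, bytes):
        return raw.decode(encoding)
    return raw


# cp1252 maps 0x80 to the euro sign; iso-8859-1 maps it to a C1 control code.
as_cp1252 = decode_at_boundary(EURO_CP1252, "cp1252")
as_latin1 = decode_at_boundary(EURO_CP1252, "iso-8859-1")

# The same byte is not valid UTF-8, so guessing the wrong codec fails loudly.
try:
    decode_at_boundary(EURO_CP1252, "utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Once the text is decoded correctly, UTF-8 round-trips it without loss.
round_trip = as_cp1252.encode("utf-8").decode("utf-8")
```

Decoding once at the point where bytes enter the program, and handing only decoded text to the GUI, is the approach recommended earlier in the thread.
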
All this stuff about sys.getdefaultencoding(),
wx.GetDefaultPyEncoding(), # -*- coding: whatever -*-,
locale.setlocale(), codecs, BOM, is extremely confusing if you are
not a Python guru (which I am not). There are many resources on the
web to read about it, but sometimes they only add to the
confusion.
I am curious to see what will happen when I start moving my
database-based app to unicode (remembering the GUI2Exe encoding
nightmare, God Save Andrea).
Andrea.
"Imagination Is The Only Weapon In The War Against Reality."
http://xoomer.alice.it/infinity77/
On Dec 20, 2007 7:52 PM, Robin Dunn wrote:
> On Dec 20, 2007 4:39 AM, jmf <jfauth@bluewin.ch> wrote: