Unicode/py2exe/GUI2exe question, (wx)Python recommended practice

Hello,

what is the proper way to handle Unicode/UTF-8 in wxPython? What will
be the proper way in the future (Python 3/wxPython Phoenix)?

Running the following .py file (encoded in UTF-8, with strings both
with and without the u'' prefix)
http://pastebin.com/2wVAUzfU
in different ways gives different results (Python 2.7.2/wxPython
2.8.12.1-unicode on Windows XP SP3 with the Luna theme, the one with
the slightly rounded buttons and colored edges).

1) Running the Code from Eclipse/PyDev

German umlauts/Chinese characters are displayed properly in all cases
(regardless of string prefix)
The XP theme is presented properly

2) Running the .py-File from cmd or by double-clicking in the Explorer

Only the u''-prefixed strings are shown properly, the unprefixed
strings are shown as two garbled characters
The XP theme is presented properly

3) Creating an exe with GUI2exe/py2exe *without* XP Manifest

Same as 2), and the button theme is now Windows Standard (like Win2000/
Server 2003, with a grey border)

4) Creating an exe with GUI2exe *with* Include XP Manifest checked

The generated .exe crashes with the message (translated from German):
"The application could not be initialized (0xc0000142)..."

5) Replacing the manifest:

According to Werner Bruhin/Cody Precord the manifest in GUI2exe
Constants.py
http://code.google.com/p/gui2exe/source/browse/trunk/Constants.py
can be replaced by the one from
http://wiki.wxpython.org/py2exe-python26

Result: the EXE works, visual result as in 2)

Questions:

Do I now have to prefix all strings containing special characters with
u'' in addition to specifying
# -*- coding: utf-8 -*-
in the file header?

How should this be done in the future with Python 3/Phoenix?

With thanks in advance,
nepix

not really a wxPython question, but a python one -- if you want
non-ASCII characters you need unicode objects, not py2 strings.

In py3, all strings are unicode objects.

In py2, a literal with a u"" is a unicode object, one with a plain ""
is a py2 string, which corresponds to a py3 bytes object.
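
A minimal sketch of why an unprefixed literal can show up as two garbled characters per umlaut. It assumes the source file is saved as UTF-8 and that the bytes end up being read as cp1252; which encoding is actually applied at run time depends on the environment, so treat cp1252 here as an assumption. The variable names are illustrative only.

```python
from __future__ import print_function, unicode_literals

text = "\xfc"                    # one character: u with diaeresis
raw = text.encode("utf-8")       # the bytes a plain "" literal holds in a UTF-8 file on py2
garbled = raw.decode("cp1252")   # the same bytes read with the wrong encoding

# One character becomes two bytes, and two bytes become two wrong characters.
print(len(text), len(raw), len(garbled))
```

The unicode object keeps the umlaut as a single character, while the byte string only carries its UTF-8 encoding and leaves the interpretation to whoever receives it.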

There are two ways to write compatible code:

1) py3 syntax:

in python2.7:
from __future__ import unicode_literals

at the top of your file will essentially add a "u" to the beginning of
each string, so all string literals are unicode objects, just like
py3k

2) py2 syntax:

put a u"" on ALL your literals. The latest py3k, 3.3, accepts this u
(ignoring it) for backward compatibility.

I'd go with option 1) -- it's the way of the future.
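
As a rough illustration of option 1), the following sketch should behave the same on Python 2.7 and Python 3. The names `text_type`, `caption` and `prefixed` are made up for this example.

```python
from __future__ import unicode_literals
import sys

# The text type is `unicode` on Python 2 and `str` on Python 3.
# The conditional expression never evaluates `unicode` on Python 3.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821

caption = "Gr\xfc\xdfe"    # plain literal, no u"" prefix
prefixed = u"Gr\xfc\xdfe"  # option 2) spelling, accepted by 2.7 and 3.3+

# With the __future__ import both literals are text, not bytes.
assert isinstance(caption, text_type)
assert caption == prefixed
```

With the import in place, the plain literal can be passed to any API that expects text without an explicit decode step.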

NOTE: the
# -*- coding: utf-8 -*-
in the file header specifies the encoding of the source file itself --
it has no effect on how the syntax of the code is interpreted.
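
To make the role of the declaration concrete, here is a small sketch (written for Python 3; the helper name `literal_from_source` is hypothetical). It compiles the same source bytes under two different coding declarations. The declaration changes how the bytes of the file are decoded, nothing else.

```python
def literal_from_source(declared_encoding):
    """Compile a tiny module from raw bytes and return its string `s`."""
    source = (
        b"# -*- coding: " + declared_encoding.encode("ascii") + b" -*-\n"
        b"s = u'\xc3\xbc'\n"   # the two bytes of a UTF-8 encoded u-umlaut
    )
    namespace = {}
    exec(compile(source, "<demo>", "exec"), namespace)
    return namespace["s"]


as_utf8 = literal_from_source("utf-8")      # bytes decoded correctly
as_latin1 = literal_from_source("latin-1")  # same bytes, wrong declaration

print(len(as_utf8), len(as_latin1))
```

The literal in the source is identical in both cases; only the declared file encoding differs, and with it the text the interpreter sees.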

-Chris

···

On Tue, Feb 12, 2013 at 3:27 PM, nepix32 <nepix32@gmail.com> wrote:

How has it to be done in the future in Python3/Phoenix?

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@noaa.gov

not really a wxPython question, but a python one -- if you want
non-ansi charcactors you need unicode object, not py2 strings.

[...]

in python2.7:
from __future__ import unicode_literals

First thanks for clarifying that this is a unicode/python issue and
has nothing to do with wxPython. It seems that your proposal 1) is the
sensible thing to do.

A) Inserting
print sys.getdefaultencoding()
into the demo file and running it from Eclipse/PyDev (which seems to
include a unicode-aware console) gives
utf-8
as output and shows *all* characters properly (also in the print
statement). Neither solution 1) nor 2) is required.

B) Doing the same on a Windows cmd line (with codepage cp1252) gives
the output
ascii
and the non-ASCII characters are garbled in wxPython. In this case,
the only way to get wxPython to display the characters properly is
to use solution 1) and/or 2).

It was counter-intuitive to me that the encoding of the console
determines how str literals are fed into wxPython. In case A) the
str is interpreted as utf-8 by wxPython, in case B) it is interpreted
as ascii.

So for me the wisdom from this is:
Never use characters in str/unicode in your source file which the
target system's console cannot display. Otherwise you risk a decoding
error when printing a "string" to the console.
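
One way to avoid depending on whatever default encoding the environment happens to set is to decode byte strings explicitly before handing them to the GUI. The sketch below is only an illustration; `to_text` is a hypothetical helper, and UTF-8 is assumed because the source file in this thread is saved as UTF-8.

```python
def to_text(value, encoding="utf-8"):
    """Return text; decode byte strings explicitly instead of implicitly."""
    if isinstance(value, bytes):   # `bytes` is an alias of `str` on Python 2
        return value.decode(encoding)
    return value


raw_caption = b"Gr\xc3\xbc\xc3\x9fe"   # UTF-8 bytes, as in an unprefixed py2 literal
caption = to_text(raw_caption)

# Decoding the same bytes as ASCII fails, which is what an implicit
# conversion with an 'ascii' default encoding runs into.
try:
    raw_caption.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

Text that is already unicode passes through unchanged, so the helper can be applied to every caption regardless of where it came from.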

With best regards,
nepix

···

On Feb 13, 1:09 am, Chris Barker - NOAA Federal <chris.bar...@noaa.gov> wrote:

Hi,

The default encoding in Py2.7 and earlier is ASCII. PyDev must do something about that; maybe they use something similar to what I use in my app startup script:

import sys

# People say one should leave this alone and use decode/encode, or define
# this in sitecustomize.py.
# Neither of those really works for me, so as long as the following does,
# this is what I will do until I switch to Py 3.x.
if hasattr(sys, "frozen"):  # py2exe does not run site.py
    sys.setdefaultencoding('utf-8')
    del sys.setdefaultencoding
else:
    # The Python interpreter needs to reload sys to get the function back.
    # Save/restore the excepthook, otherwise WingIDE won't see some exceptions.
    hook = sys.excepthook
    reload(sys)
    sys.setdefaultencoding('utf-8')
    del sys.setdefaultencoding
    sys.excepthook = hook

Werner

···

On 13/02/2013 10:06, nepix32 wrote:

On Feb 13, 1:09 am, Chris Barker - NOAA Federal <chris.bar...@noaa.gov> wrote:

[make console encoding utf-8]

if hasattr(sys, "frozen"): #Py2Exe does not run Site.py
    sys.setdefaultencoding('utf-8')
    del sys.setdefaultencoding
else:
    #The Python interpreter needs to reload the function
    # save/restore the excepthook, otherwise WingIDE won't see some exceptions
    hook = sys.excepthook
    reload(sys)
    sys.setdefaultencoding('utf-8')
    del sys.setdefaultencoding
    sys.excepthook = hook

I tried this and it changes the console encoding to 'utf-8'. However,
the captions for my frame are still presented as str to wxPython and
it still displays garbled characters (UTF-8 as ascii). So this does
not work for me. Example on pastebin:
http://pastebin.com/g8EchLDU

PyDev uses a more complex script for setting encoding, see
https://github.com/aptana/Pydev/blob/development/plugins/org.python.pydev/pysrc/pydev_sitecustomize/sitecustomize.py
however, I do not understand how this works.

Thanks,
nepix

···

On Feb 13, 10:23 am, Werner <werner.bru...@sfr.fr> wrote:

[make console encoding utf-8]

if hasattr(sys, "frozen"): #Py2Exe does not run Site.py
    sys.setdefaultencoding('utf-8')
    del sys.setdefaultencoding
else:
    #The Python interpreter needs to reload the function
    # save/restore the excepthook, otherwise WingIDE won't see some exceptions
    hook = sys.excepthook
    reload(sys)
    sys.setdefaultencoding('utf-8')
    del sys.setdefaultencoding
    sys.excepthook = hook

I tried this and it changes the console encoding to 'utf-8'. However,
the captions for my frame are still presented as str to wxPython and
it still displays garbled characters (UTF-8 as ascii). So this does
not work for me. Example on pastebin:
http://pastebin.com/g8EchLDU

With the above you still need to use u"" or the "unicode_literals" import as suggested by Chris. If I uncomment that line it works for me in WingIDE.

PyDev uses a more complex script for setting encoding, see
https://github.com/aptana/Pydev/blob/development/plugins/org.python.pydev/pysrc/pydev_sitecustomize/sitecustomize.py
however, I do not get it how this works.

They do a lot more than encoding, but the encoding bit is basically the above, done in sitecustomize.py, which is not usable with py2exe.

Werner

···

On 13/02/2013 11:15, nepix32 wrote:

On Feb 13, 10:23 am, Werner <werner.bru...@sfr.fr> wrote: