I’d like to ask what is probably a really trivial question about some encoding peculiarities in wx PyShell. I’m still not quite sure how unicode characters should be entered here. I use Python 2.5.2 and wxPython 2.8.8.0 unicode on Win XPh, Czech localisation. Some of my experimenting in PyShell included the following prints:
print u"ašžťčüξжאى"
ašžťčüξжאى
print u"č"
è
print "č"
č
print unicode("č", "windows-1250")
č
Strangely, the more complex case, combining characters from several alphabets, works just fine (let’s hope there won’t be problems with displaying these …); however, a trivial task that I’d need much more often seems to be somehow tricky.
A common Czech character such as č is handled correctly as a plain string, but not as a unicode literal; when it is decoded explicitly with the standard Windows Central European encoding cp1250, the character is interpreted as expected.
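One possible reading of the è result, offered as a sketch and not as a confirmed description of what PyShell does internally: the byte that cp1250 uses for č is the same byte that Latin-1 uses for è. The snippet is written in Python 3 syntax (the thread itself uses Python 2.5), and the name CZECH_C is only illustrative.

```python
# Assumption: the typed character reaches Python as a cp1250 byte and is
# then read with the Latin-1 table. This is a guess, not a verified fact
# about PyShell.

CZECH_C = "\u010d"                     # LATIN SMALL LETTER C WITH CARON (č)

raw = CZECH_C.encode("cp1250")         # one byte: 0xE8
as_latin1 = raw.decode("latin-1")      # 0xE8 in Latin-1 is U+00E8 (è)
as_cp1250 = raw.decode("cp1250")       # 0xE8 in cp1250 is U+010D (č)

# ascii() keeps the output readable on any console encoding.
print(ascii(raw), ascii(as_latin1), ascii(as_cp1250))
```

If this guess is right, it would explain why decoding explicitly with "windows-1250" gives the expected character while the bare unicode literal does not.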
I’d like to ask about the first example in the prompt above: is there maybe some kind of heuristic that analyses the content of the string in order to display it correctly? (It seems that in this case utf-8 was chosen:
print unicode("ašžťčüξжאى", "utf-8")
ašžťčüξжאى
)
I guess I must be missing something quite trivial here; I couldn’t find out why the mentioned characters are treated differently, as they are all simply typed or pasted into the shell.
(sys.getdefaultencoding() shows ‘ascii’, as I haven’t messed with these global settings so far.)
Could someone maybe advise how to work with all unicode characters in the same way, without the need to distinguish the ones from the national codepage from the others?
wx.GetDefaultPyEncoding() shows ‘cp1250’, so it seems likely that the encoding used here would be either this or ‘utf-8’. I thought the behaviour would be rather deterministic and didn’t expect such differences.
Of course, the results are different in other shells: the Windows cmd console is ok with print u"č" as well as print "č", but shows question marks for all characters beyond the national codepage. Idle works similarly to PyShell regarding print u"č" >> è, but also throws “Unsupported characters in input” on the “foreign” characters (despite being able to display them correctly after pasting).
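The question marks in the cmd console look like what a single-byte code page produces when asked to encode characters it does not contain. A minimal sketch in Python 3 syntax, assuming cp1250 as a stand-in for whatever code page the console really uses; console_safe is a hypothetical helper name:

```python
def console_safe(text, console_encoding):
    """Return text as it would survive a round trip through a limited codec.

    Characters the codec cannot represent are replaced with '?'.
    """
    encoded = text.encode(console_encoding, errors="replace")
    return encoded.decode(console_encoding)


mixed = "\u010d\u03be\u0436"           # č (Czech), ξ (Greek), ж (Cyrillic)
shown = console_safe(mixed, "cp1250")  # assumption: console behaves like cp1250

print(ascii(shown))
```

Only the character that exists in the national code page survives; the Greek and Cyrillic letters come back as question marks.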
That’s why I am using PyShell (& co.), as this is for me the most usable Python shell I’ve tried so far.
It seems that after calling wx.SetDefaultPyEncoding("utf-8") the following prints all work ok:
print unicode("ašžťčüξжא", "utf-8"), unicode("č", "utf-8"), "ašžťčüξжא", "č" (also in separate prints)
however, what is maybe the most natural way, print u"ašžťčüξжא", u"č", is messed up: ašžťÄÃ¼Î¾Ð¶× Ä
Besides, I’m not sure what else could be modified by this global setting; it seems safer to differentiate between cp1250 and utf-8, which can be expected on my current system.
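The tail of the garbled output (Ã¼, Î¾, Ð¶) resembles the pattern produced when UTF-8 bytes are decoded with a single-byte table such as Latin-1. The sketch below reproduces that pattern for a few of the characters; it is an illustration in Python 3 syntax, not a diagnosis of the shell, and misdecode is a hypothetical helper.

```python
def misdecode(text, real_encoding, assumed_encoding):
    """Encode text with one codec, then decode the bytes with another."""
    return text.encode(real_encoding).decode(assumed_encoding)


original = "\u010d\u00fc\u03be\u0436"  # č ü ξ ж

# Each character becomes two UTF-8 bytes, and each byte is then shown
# as its own Latin-1 character.
garbled = misdecode(original, "utf-8", "latin-1")

# Reversing the two steps recovers the text, as long as no byte was lost.
repaired = garbled.encode("latin-1").decode("utf-8")

print(ascii(garbled))
print(repaired == original)
```

Some of the resulting Latin-1 characters are invisible control codes, which may be why the output in the shell looks shorter than two characters per letter.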
I’d like to ask probably a really trivial question about some encoding peculiarities in wx PyShell. …
print u"ašžťčüξжאى"
ašžťčüξжאى
print u"č"
è
print "č"
č
print unicode("č", "windows-1250")
č
…
I guess I must be missing something quite trivial here; I couldn’t find out why the mentioned characters are treated differently, as they are all simply typed or pasted into the shell.
(sys.getdefaultencoding() shows ‘ascii’, as I haven’t messed with these global settings so far.)
Could someone maybe advise how to work with all unicode characters in the same way, without the need to distinguish the ones from the national codepage from the others?
What does wx.GetDefaultPyEncoding() show?
Keep in mind that PyCrust is a wx app too, and so it is treating what is pasted, typed, printed etc. as unicode and/or utf-8 automatically, or it is at least making some assumptions about it based on the runtime environment. In other words, it’s not quite the same here as compared to executing python code normally from a source file.
You’d also probably get different results for these tests if you were to try them in a plain console/terminal window.
I don’t think there are any shortcuts that are 100% safe. If you deal with strings, the best thing to do is either to have a way to know what the encoding is, so you can safely convert them to unicode objects as needed, or to assume that the string is encoded in the default encoding for the locale and be prepared to catch the decode exceptions if it is not.
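The advice above could be sketched roughly as follows. This is Python 3 syntax (where byte strings and text are separate types) and to_text is a hypothetical helper, not part of wxPython; falling back to replacement characters is just one way of handling a failed decode.

```python
import locale


def to_text(data, encoding=None):
    """Convert a byte string to text.

    If the encoding is known, pass it in. Otherwise the locale's
    preferred encoding is assumed, and a failed decode falls back to
    replacing the undecodable bytes instead of raising.
    """
    if isinstance(data, str):
        return data                      # already text, nothing to do
    if encoding is None:
        encoding = locale.getpreferredencoding(False)
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        # The assumption about the encoding was wrong; keep what can be kept.
        return data.decode(encoding, errors="replace")
```

For example, to_text(b"\xe8", "cp1250") gives č, while the same byte passed with "utf-8" is not valid and comes back as the replacement character.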
Thank you for your kind reply. I wanted to enter into a more detailed
discussion until I read this:
If I try to run the source files on my unicode build, it seems to work
(however, the latest version, psi85, probably has some problems with the casing
of the module names and doesn't run on my computer, e.g.:
import psiframe
from psiglobal import Glb
psi84-py252-wxpy2871ansi is ok:
import psiFrame
import psiGlobal
)...Trying this second-to-last version, I don't
see major problems at first glance; what is supposed to cause problems on
unicode builds (except maybe some highlighting glitches)?
I'm very disturbed by this. Starting with psi85, I decided to use a more "Pythonic"
convention: lower-case names for Python modules and capitalized names for
classes. I achieved this by just renaming the files on my w2k box (fat32).
I did not notice any problem on *my* platform after that change. The
py2exe'ified version of psi85 has been tested on a win XP box (fat32) and
on vista (NTFS).
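One way to check whether a case-only rename actually made it into a given copy of the sources is to compare the names used in the import statements against the names the directory really reports. The sketch below is only a diagnostic idea in Python 3 syntax; find_case_mismatches is a hypothetical helper and nothing in it is specific to psi. It matters because Python compares module names case-sensitively on import, even on file systems that ignore case.

```python
import os


def find_case_mismatches(directory, module_names):
    """Map each wanted module name to the differently cased name on disk.

    Only modules whose .py file exists under a different casing are
    reported; exact matches and missing modules are left out.
    """
    on_disk = {}
    for entry in os.listdir(directory):
        stem, ext = os.path.splitext(entry)
        if ext == ".py":
            on_disk.setdefault(stem.lower(), []).append(stem)

    mismatches = {}
    for name in module_names:
        candidates = on_disk.get(name.lower(), [])
        if candidates and name not in candidates:
            mismatches[name] = candidates[0]
    return mismatches
```

Called with the source directory and the names from the failing imports, for example ["psiframe", "psiglobal"], it would report any file that still carries the old capitalisation.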
Does anybody have an explanation? Would it be possible to get some more feedback from
other users?
Vlastimil, did you have the same issues with the exe version of psi85?
I have been maintaining this application for 6 years now, releasing a tested
version for every new wxPython.
So I decided to step away from the "program" for a bit and wanted to fool around with the icons and bitmaps... I had read somewhere that you can embed bitmaps into code. So I researched and found img2py.
Anyhow, I made my own 16x16 png icons, dumped the ones I got that were free to use, and embedded mine into the program I'm working on... I don't know why I'm so impressed with it. I guess it took one more "complication" out of trying to distribute the program once it's all done.
Hats off to the wxPython "Think-tank."
Which leads me to another question I just thought of. I read somewhere about stock graphics in wxPython.
I see all the nifty things in the demo for wxPython, but it seems wxPython's demo is still missing some of its features. (I did say "seems..." It may not be missing anything.) Anyhow, is there a "central" source to find out about things like img2py and whatnot?
Oh, last question for now... how come the predefined colors in wxPython (wx.RED, wx.WHITE etc...) are missing other colors like wx.YELLOW? I know you can use colors by their RGB and hex values.
I'm just curious.
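For the RGB and hex route mentioned above, a small sketch of turning a hex string into the three integer components. hex_to_rgb is a hypothetical helper and does not import wx; the resulting tuple is the kind of value that could then be handed to wx.Colour(r, g, b).

```python
def hex_to_rgb(value):
    """Convert a '#RRGGBB' (or 'RRGGBB') string to an (r, g, b) tuple."""
    digits = value.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hex digits, got %r" % value)
    # Take the digits two at a time and read each pair as a base-16 number.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


YELLOW_RGB = hex_to_rgb("#FFFF00")
print(YELLOW_RGB)
```

With wx available, something like wx.Colour(*YELLOW_RGB) would give a yellow colour object without relying on a predefined constant.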