Conflict between wxFileDialog and wxProcess in py2apped program on OS X

Hi all,

I'm using Python 2.6.7, wxPython 2.9.4.0, and py2app 0.7.3 on OS X 10.7.5. However, I have reason to believe the same issue occurs with Python 2.6.6, wxPython 2.8.12.0, and bundlebuilder.

In my application, I use wxProcess to call FFMpeg to do media conversion. I recently noticed that once I have opened a wxFileDialog, my FFMpeg call stops working with Unicode file names, that is, those with accented characters. This happens only when I have used py2app to package my application into a distributable bundle, not when I am running from source code, and only on OS X, not on Windows.

I have created a radically simplified sample app to demonstrate the underlying problem.

Test.py, the main program, shows two buttons. One calls up a test window (described next) and the other simply opens a wxFileDialog. This wxFileDialog doesn't DO anything, but it doesn't need to.

MediaConvertTest.py shows a TextCtrl with a unicode text sample in it and automatically makes a very simple OS call using wxProcess. This call simply passes the unicode text sample to the OS's "echo" command and reads the result. The unicode text sample is encoded to utf-8 during the wxProcess call.
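For comparison, the same echo round trip can be sketched without wx at all. This is not the code from Test.zip: it swaps in the standard library's subprocess module for wxProcess, and the function name `echo_roundtrip` is made up for illustration. It assumes a POSIX system with an `echo` executable on the PATH, and it should run on both Python 2 and Python 3.

```python
import subprocess


def echo_roundtrip(text):
    """Send UTF-8 encoded text through the OS 'echo' command and return the raw bytes."""
    encoded = text.encode("utf-8")
    # The argument is passed as bytes, so nothing re-encodes it on the way out.
    process = subprocess.Popen(["echo", encoded], stdout=subprocess.PIPE)
    output = process.communicate()[0]
    # echo appends a newline; strip it so only the echoed bytes remain.
    return output.rstrip(b"\n")


result = echo_roundtrip(u"Dem\u00f6.mpg")
print(len(result))
```

If this stand-in returns the 9-byte form while the wxProcess version does not, that would at least suggest the difference lies in how the command line is handed to the child process rather than in `echo` itself.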

setup_TEST.py is a py2app setup script for packaging test.py. "python setup_TEST.py py2app", of course.

Build and run test.app on OS X. (The problem doesn't occur on Windows.) Press the Open Test Window button. You see that echo is called, and what is returned is a string of length 9 with the utf-8-encoded string representation of the unicode test sample. This is exactly what I expect.

Close the Test Window. Now press the Open a File Dialog button and press Cancel on the File Dialog. We've done nothing but open a wxFileDialog and close it again.

Now press Open Test Window again. It, of course, does exactly the same thing it did before. But this time, it returns a string of length 8, and the unicode character in our unicode text sample is NOT encoded using utf-8. We get character 246, which is the code point of the unencoded unicode character (what we get with unichr(246)), instead of the utf-8-encoded byte pair chr(195) + chr(182) for that character. I want to reiterate: we have a STRING object here, not a unicode object, with an unencoded unicode character in it, even though I explicitly encode the unicode sample text to a utf-8 string as part of the wxProcess call.
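The two byte sequences described here can be reproduced in plain Python. The sketch below assumes the 8-byte result is what a single-byte encoding such as Latin-1 would produce; that matches the observed value 246, but whether it is what actually happens inside wxProcess is only an assumption. The code works on both Python 2 and Python 3.

```python
sample = u"Dem\u00f6.mpg"

# bytearray yields integer byte values on both Python 2 and Python 3.
utf8_bytes = list(bytearray(sample.encode("utf-8")))
latin1_bytes = list(bytearray(sample.encode("latin-1")))

print(utf8_bytes)    # the o-umlaut becomes the two bytes 195, 182
print(latin1_bytes)  # the o-umlaut becomes the single byte 246
```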

You can run from source code and you won't see this. You can run the test repeatedly without opening the File Dialog and you won't see this. You will only see this when you've built an app bundle and have opened a wxFileDialog. So using the wxFileDialog in a bundled app somehow alters the way character encoding is handled, at least when using wxProcess to interact with the operating system. Encoding the text sample before the process call doesn't change the results.

For me, the practical implication is that I can't access my FFMpeg program to convert files with unicode characters in the file names any more once a wxFileDialog call has been made somewhere in my program.

Can anyone suggest a work-around?

Thanks for your time,

David

Test.zip (4.44 KB)

David Woods wrote:

Build and run test.app on OS X. (The problem doesn't occur on
Windows.)

This may be the first time in history that someone has reported a
Unicode-related problem that did NOT involve Windows...

--
Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.

Admittedly, it wasn't easy to isolate this problem.

David

David Woods wrote:

You can run from source code and you won't see this. You can run the
test repeatedly without opening the File Dialog and you won't see this.
You will only see this when you've built an app bundle and have opened a
wxFileDialog. So using the wxFileDialog in a bundled app somehow alters
the way character encoding is handled, at least when using wxProcess to
interact with the operating system. Encoding the text sample before the
process call doesn't change the results.

For me, the practical implication is that I can't access my FFMpeg
program to convert files with unicode characters in the file names any
more once a wxFileDialog call has been made somewhere in my program.

Can anyone suggest a work-around?

Please try it with the preview build at:

     http://wxpython.kosoftworks.com/preview/20130318/

I'm using that build and your sample works fine for me after commenting out a couple of missing things in setup_TEST.py. My results are pasted below. The other differences in my test are that I'm using Python 2.7 instead of 2.6 (which shouldn't matter for this use case) and OS X 10.8.4 (which might).

I have a dim recollection of a fix being committed that had something to do with the file dialog and locale settings. If my memory is correct then that may have been a fix for the problem you are seeing with this case.

Encoding Information:
   defaultPyEncoding: utf_8, filesystemencoding: utf-8

Media Filename:
Demö.mpg (<type 'unicode'>)
Process call:
"echo" "Demö.mpg"

stream.CanRead() call successful. (<type 'str'>)

00 068 'D'
01 101 'e'
02 109 'm'
03 195 'Ã'
04 182 '¶'
05 046 '.'
06 109 'm'
07 112 'p'
08 103 'g'
09 Demö.mpg (<type 'str'>)

00 (<type 'str'>)

--
Robin Dunn
Software Craftsman


Hi Robin,

Thanks for looking into this.

If I run my test program without opening a FileDialog, I get what you got. But I ran my built test with the preview version of wx, and after opening the FileDialog, here’s what I got:

wx version: 2.9.5.0.b20130318
Encoding Information:
   defaultPyEncoding: utf_8, filesystemencoding: utf-8

Media Filename:
Demö.mpg (<type 'unicode'>)
Process call:
"echo" "Demö.mpg" (<type 'str'> - <type 'unicode'>)

00 068 'D'
01 101 'e'
02 109 'm'
03 195 'Ã'
04 182 '¶'
05 046 '.'
06 109 'm'
07 112 'p'
08 103 'g'
09 Demö.mpg (<type 'str'>)

stream.CanRead() call successful. (<type 'str'>)

00 068 'D'
01 101 'e'
02 109 'm'
03 246 'ö'
04 046 '.'
05 109 'm'
06 112 'p'
07 103 'g'
EXCEPTION RAISED:
<type 'exceptions.UnicodeDecodeError'>
'utf8' codec can't decode byte 0xf6 in position 7: invalid start byte

00 (<type 'str'>)

So the problem still exists for me. My newer version of the test program shows the utf-8 encoded string at the top that I send to wxProcess and shows the invalid, unicodified “string” that I get back from the wxProcess call second. What I believe, based on what happens with FFMpeg, is that my program is sending this mis-encoded string to the program called in wxProcess regardless of how I encode it.
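A listing like the one above, plus the decode check that raises the exception, can be sketched as follows. The helper names `dump_bytes` and `is_valid_utf8` are hypothetical and are not taken from the test program; the code is written to run on both Python 2 and Python 3.

```python
def dump_bytes(raw):
    """Return one 'index  value  character' line per byte of a byte string."""
    lines = []
    for index, value in enumerate(bytearray(raw)):
        lines.append("%02d    %03d    %r" % (index, value, chr(value)))
    return lines


def is_valid_utf8(raw):
    """Report whether a byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


sent = b"Dem\xc3\xb6.mpg"   # what the test encodes before the process call
received = b"Dem\xf6.mpg"   # the 8-byte form seen after the file dialog was opened

for line in dump_bytes(received):
    print(line)
print(is_valid_utf8(sent), is_valid_utf8(received))
```

Byte 0xf6 can never start a UTF-8 sequence, which is why the second string fails to decode no matter what follows it.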

I’ll track down a newer Mac and run my test on that computer. That’ll address the OS X version difference.

I’ll also look into the locale setting. I looked at defaultPyEncoding and sys.getfilesystemencoding(), but I didn’t look at locale.
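One way to look at the locale alongside the two encodings already checked is to collect everything in a single report and compare it before and after opening the file dialog. This is only a sketch using the standard library; the function name `encoding_report` is made up, and nothing in this thread establishes that these values actually change.

```python
import locale
import sys


def encoding_report():
    """Collect the interpreter's encoding settings and the C-level LC_CTYPE locale."""
    return {
        "default encoding": sys.getdefaultencoding(),
        "filesystem encoding": sys.getfilesystemencoding(),
        # Calling setlocale with only a category queries the current setting.
        "LC_CTYPE locale": locale.setlocale(locale.LC_CTYPE),
        # False means: report the preferred encoding without changing the locale.
        "preferred encoding": locale.getpreferredencoding(False),
    }


report = encoding_report()
for name in sorted(report):
    print("%s: %s" % (name, report[name]))
```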

I’ll also mess with my setup.py file to see if that could be making a difference, although I can’t drop the “ditto” call because of the way I embed the database engine.

I’ll let you know what I find.

David

We have a winner!

The problem does not occur on OS X 10.8.4 with wxPython 2.9.4.0 or wxPython 2.9.5.0. It does occur with OS X 10.7.5 with both wxPython versions.

So … I simply explain to my users that if they want to use file names that include characters in languages other than English, then they have to upgrade their OS X version? That’s gonna go over well. That probably applies to more than half of my OS X users.

Any better ideas? Anyone?

David

On 06/21/2013 12:53 PM, David wrote:

> I'll track down a newer Mac and run my test on that computer. That'll address the OS X version difference.

Test_Solved.zip (4.61 KB)


We have another winner!!!

I've found that the wxLocale object doesn't REPORT anything different when the problem I've been chasing is occurring, BUT if I simply reset the Locale using the same language setting, the problem disappears in OS X 10.7.5.

I've attached the updated source code for my sample app. All I really did was add a wxLocale at the beginning of the app and then add a line in MediaConvertTest.py that updates the locale based on the locale set at the main program level.
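The attached code is not reproduced in this thread, so the following is only a guess at the shape of such a reset, not the actual change in Test_Solved.zip. The function name `reset_locale` and the `owner.locale` attribute are hypothetical. The wx module is passed in as a parameter so that the sketch can be imported on a machine without wxPython; `wx.Locale`, `wx.LANGUAGE_DEFAULT`, and `Locale.GetLanguage()` are the only wx names it relies on.

```python
def reset_locale(owner, wx_module):
    """Replace owner.locale with a fresh Locale object for the same language.

    owner is any long-lived object (for example the wx.App instance);
    wx_module is the imported wx package.
    """
    current = getattr(owner, "locale", None)
    if current is None:
        # First call: no locale has been created yet, so use the system default.
        language = wx_module.LANGUAGE_DEFAULT
    else:
        # Later calls: re-create the locale with the language already in use.
        language = current.GetLanguage()
    # Keep a reference to the new object. As far as I know, a wx.Locale that is
    # destroyed restores the previously active locale.
    owner.locale = wx_module.Locale(language)
    return owner.locale
```

In a real application this might be called as `reset_locale(wx.GetApp(), wx)` once at startup and again just before each wxProcess call, which corresponds to the two changes described above.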

Robin, please note that this problem is still present in your wxPython 2.9.5.0 pre-release on OS X 10.7.5. So if the problem that you referred to earlier was fixed, that fix somehow isn't making it into your latest build. Your comment about that problem was crucial to helping me sort this out, so thanks much for that hint.

Now I'm off to implement this simple fix in my full program. I sure hope it works full-scale.

Thanks much for your help with this.

David

David Woods wrote:

We have another winner!!!

I've found that the wxLocale object doesn't REPORT anything different
when the problem I've been chasing is occurring, BUT if I simply reset
the Locale using the same language setting, the problem disappears in OS
X 10.7.5.

Interesting...

I've attached the updated source code for my sample app. All I really
did was add a wxLocale at the beginning of the app and then add a line
in MediaConvertTest.py that updates the locale based on the locale set
at the main program level.

Robin, please note that this problem is still present in your wxPython
2.9.5.0 pre-release on OS X 10.7.5. So if the problem that you referred
to earlier was fixed, that fix somehow isn't making it into your latest
build. Your comment about that problem was crucial to helping me sort
this out, so thanks much for that hint.

TBH I'm not sure if the change I was thinking about is the same thing, or even if it was made before or after the latest preview build, so just about anything is possible. :)

--
Robin Dunn
Software Craftsman