Unicode, Windows, MediaCtrl, and QuickTime

Hello,

I continue to work on improving the i18n for my video analysis software. I recently discovered what I'm pretty sure is a wxPython or wxWidgets bug.

On Windows, the wx.MediaCtrl cannot load files that have non-cp1252 PATHS if the QuickTime back end is used.

To be more specific, the file "E:\Vidëo\亲亳 亲\Test-MOV.mov" will not load in the wx.MediaCtrl() because of the Chinese characters in the PATH portion of the file name. However, the file "E:\Vidëo\Unicödê\亲亳亲-AnalysisMOV.mov" loads fine because the PATH is cp1252 even though there's Chinese in the file name itself. The MPEG-1 versions of both files load fine on Windows because we're not using the QuickTime back end, and both QuickTime files load fine on OS X, just not Windows. (I don't have the capacity to test this on Linux.)
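
To make the distinction concrete, here is a minimal sketch that checks whether the directory portion of a Windows-style path can be encoded as cp1252. The helper name directory_is_cp1252 is made up for illustration, and it assumes cp1252 is the code page that matters, as it appears to be on my machine.

```python
import ntpath  # Windows path rules, usable on any platform


def directory_is_cp1252(full_path):
    """Return True if the directory part of full_path encodes as cp1252.

    Hypothetical helper: only the directory is checked, because the
    file name itself does not seem to trigger the problem.
    """
    directory, _file_name = ntpath.split(full_path)
    try:
        directory.encode('cp1252')
    except UnicodeEncodeError:
        return False
    return True


# The two example paths from this message
failing = r"E:\Vidëo\亲亳 亲\Test-MOV.mov"
working = r"E:\Vidëo\Unicödê\亲亳亲-AnalysisMOV.mov"
print(directory_is_cp1252(failing), directory_is_cp1252(working))
```

The check only describes which paths fall into the failing category; it says nothing about why the QuickTime back end rejects them.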

So my question is, should I report this as a bug? If so, to wxPython or to wxWidgets?

Here's my work-around code, if anyone's interested. (self.movie is a wx.MediaCtrl, and filename is the file name including the full path.)

    # (This fragment assumes "import os, sys, wx, wx.media" at module level.)
    # First, detect Windows and QuickTime
    if ('wxMSW' in wx.PlatformInfo) and (self.backend == wx.media.MEDIABACKEND_QUICKTIME):
        # Change the Python Encoding to cp1252
        wx.SetDefaultPyEncoding('cp1252')
        # Get the Current Working Directory
        originalCWD = os.getcwd()
        # Divide the path and the file name from each other
        (currentPath, currentFileName) = os.path.split(filename)
        # Change the Current Directory to the file's location
        os.chdir(currentPath)
        # Encode just the File Name portion of the path
        tmpfilename = currentFileName.encode(sys.getfilesystemencoding())
    # If we're not on Windows OR we aren't using QuickTime ...
    else:
        # ... then the unencoded file name including the full path works just fine!
        tmpfilename = filename

    # Try to load the file in the media player.
    if self.movie.Load(tmpfilename):

        <snip>

    # Again, detect Windows and QuickTime
    if ('wxMSW' in wx.PlatformInfo) and (self.backend == wx.media.MEDIABACKEND_QUICKTIME):
        # Reset the Default Python encoding back to UTF8
        wx.SetDefaultPyEncoding('utf8')
        # Reset the Current Working Directory to what it used to be
        os.chdir(originalCWD)
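
The change-directory part of the work-around could also be wrapped in a context manager so the original working directory is restored even if loading raises an exception. This is only a sketch under assumptions: working_directory_of is a hypothetical name, and it leaves the encoding of the file name and the wx.SetDefaultPyEncoding() calls to the caller.

```python
import contextlib
import os


@contextlib.contextmanager
def working_directory_of(full_path):
    """Temporarily chdir into the directory that holds full_path.

    Yields the bare file name. The previous working directory is
    restored on exit, even if the body of the with-block raises.
    """
    directory, file_name = os.path.split(full_path)
    original_cwd = os.getcwd()
    if directory:
        os.chdir(directory)
    try:
        yield file_name
    finally:
        os.chdir(original_cwd)
```

With that in place, the load step would look something like "with working_directory_of(filename) as name:" followed by the self.movie.Load() call on the (encoded) name.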

Thanks,

David

David K. Woods, Ph.D.
Researcher, Transana Lead Developer
Wisconsin Center for Education Research
University of Wisconsin, Madison
http://www.transana.org

> On Windows, the wx.MediaCtrl cannot load files that have non-cp1252 PATHS if the QuickTime back end is used.

What happens if you pass a Unicode object to Load() instead of an encoded string? Setting wxPython's default encoding basically just sets what it will use to convert Python strings to wxString unicode values, so if you convert it to unicode first I expect that you will not have to do the switching of the default encoding back and forth.

> So my question is, should I report this as a bug? If so, to wxPython or to wxWidgets?

I'm not sure yet.


On 3/14/12 10:59 AM, David Woods wrote:

--
Robin Dunn
Software Craftsman

> What happens if you pass a Unicode object to Load() instead of an encoded string? Setting wxPython's default encoding basically just sets what it will use to convert Python strings to wxString unicode values, so if you convert it to unicode first I expect that you will not have to do the switching of the default encoding back and forth.

filename IS a unicode object until I encode it as tmpfilename in my workaround. Load() fails. It fails whether I change the default encoding or not.

>> So my question is, should I report this as a bug? If so, to wxPython or to wxWidgets?
>
> I'm not sure yet.

Then I'm really glad I asked. Thanks for looking into this.

I've attached my testing playground app, if that's any help.

David

Video_Test_Minimal.py (3.39 KB)

Hi David,

Could you send me a zip file (or other archive format) containing folders and video file names matching your testing app so I can make sure that I set up my testing environment the same as yours?

Thanks,


On 3/14/12 12:02 PM, David wrote:

--
Robin Dunn
Software Craftsman

Hi Robin,

Thanks so much for your time.

First, go to Transana.com and look for the download link with your name on it. You can download the necessary test files from there. It's 500 MB, so please let me know once you have it. My server guys would prefer I didn't leave large files like this up for longer than I need to.

I'm running Python 2.6.6 and wxPython 2.8.12.1 on Windows XP and OS X 10.5.8.

The download file Robin.zip contains a new copy of the Video_Test_Minimal.py file that's a bit cleaner than the earlier one and that points to the files I've included (which I gave simpler names in the latest example). There are also two directories, Unicödê and 亲亳 亲, each with a pair of *.mpg files and a pair of *.mov files. (Yes, it's 4 copies of each of two files, so very inefficient for downloading, but renaming files to Chinese can be a pain!)

What I do is have the files and directories in a shared folder called Robin on my Windows computer and use Connect to Server on my Mac to connect to that folder from my Mac. You can install the files wherever works for you, and then set your local paths as the videoRoot variable on lines 15 (Windows) and 18 (OS X) in Video_Test_Minimal.py.

On line 21, you can set whether you're looking at the *.mov files or the *.mpg files.

On line 22, set the showProblem variable to True to see the issue I'm talking about, or to False to see that my work-around actually works. (Yes, another example of how creating a minimal example can help you solve a problem you're having.)

So if you are running on Windows AND have the fileExtension set to u'.mov' AND have showProblem set to True, you will see that two of the *.mov files load and two of the *.mov files, the ones in the directory that has Chinese characters, fail to load.

If you're on OS X, OR if you have fileExtension set to u'.mpg' OR you have showProblem set to False, you should see four video files playing simultaneously, if not synchronously.
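
Put another way, the outcome I expect for each test file can be written as a small predicate. This is just a restatement of the two paragraphs above, not code from Video_Test_Minimal.py; expected_to_load and its parameter names are made up for illustration.

```python
def expected_to_load(on_windows, file_extension, show_problem, directory_is_cp1252):
    """Predict whether one test video loads, per the description above.

    Hypothetical helper: a file fails only when every one of the four
    conditions for the problem holds at the same time.
    """
    hits_problem = (
        on_windows                        # the issue is Windows-only
        and file_extension == u'.mov'     # the .mov files use the QuickTime back end
        and show_problem                  # the work-around is switched off
        and not directory_is_cp1252       # e.g. the directory with Chinese characters
    )
    return not hits_problem
```

So with showProblem set to True on Windows, the two *.mov files under Unicödê should give True and the two under 亲亳 亲 should give False.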

Sorry it takes so much work to demonstrate this problem. Again, thanks for looking into it. And thanks for your work on wxPython. It makes my life easier every day.

David
