wxTextDataObject returns utf-16le encoded text

ckkart · June 20, 2020, 4:27pm

I found out that the data obtained from a wxTextDataObject is UTF-16LE encoded (wxPython 4.1, Python 3, OSX). Is that on purpose? I was used to wxPython working with Unicode strings everywhere. Here is a short example:

    import wx

    obj = wx.TextDataObject('Jährchen Ⅷ')
    s = obj.GetDataSize()        # size of the raw data in bytes
    b = bytearray(s)             # writable buffer for the raw data
    obj.GetDataHere(b)           # copies the raw bytes into b
    print(b.decode('utf-16le'))

    Jährchen Ⅷ

Robin · June 25, 2020, 7:04pm

I’m not completely sure, but I suspect it is due to whatever the platform's native interchange format is. GetDataHere always gives you the raw bytes that were sent from the drop source or from the clipboard, no matter what the actual type is. If you use obj.GetText instead, wxWidgets should translate whatever the platform gives you into a Unicode string.
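To make the raw-bytes versus decoded-text distinction concrete, here is a minimal sketch in plain Python that does not need wx. It assumes the native buffer is UTF-16LE, as observed on OSX in the first post; other platforms may use a different encoding. The helper name `decode_native_text` is hypothetical, and stripping a trailing NUL is only a precaution in case a platform appends a terminator.

```python
def decode_native_text(raw, encoding="utf-16-le"):
    """Decode a raw text buffer and drop trailing NUL characters, if any.

    The default encoding is an assumption based on the OSX behaviour
    reported above; it is not guaranteed on other platforms.
    """
    return bytes(raw).decode(encoding).rstrip("\x00")


text = "Jährchen Ⅷ"
# Stand-in for the buffer that GetDataHere would fill with raw bytes.
raw = bytearray(text.encode("utf-16-le"))

# Every character here is in the Basic Multilingual Plane,
# so each one occupies two bytes in UTF-16.
print(len(text), len(raw))
print(decode_native_text(raw))
```

With a real wx.TextDataObject, calling GetText avoids this manual step, because the decoding is then left to wxWidgets.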

ckkart · June 25, 2020, 8:15pm

I see. I was dealing with a plain wx.DataObject returned by calling GetDataObject() on an EVT_DATAVIEW_ITEM_DROP event (on OSX/wxPython 4.1). Anyway, in the meantime I noticed that on GTK3 GetDataObject() always returns None. Looking at the wxWidgets dataview sample, I learned that the correct way to retrieve the TextDataObject is

        obj = wx.TextDataObject()
        obj.SetData(wx.DataFormat(wx.DF_UNICODETEXT), evt.GetDataBuffer())
        print(obj.GetText())

which works on both platforms.

Sorry for the noise.