Bruce Who wrote:
======= 2005-04-19 01:36:32 Robin Dunn wrote: =======
Bruce Who wrote:
Robin Dunn:
I've been toying around with the idea today to add methods to wxSTC to
allow setting and getting the text as utf-8 and bypass the conversion
to/from wxString (and unicode objects in the unicode builds).

Rather than overload the existing methods I want to make new ones so it
is clear that they are doing something different. The only problem at
this point is how to name them. My original thought was to just add a
"UTF8" suffix, like AddTextUTF8, but the problem with this is it won't be
practical to make it actually accept utf-8 text in the ansi build
because I won't know what encoding to convert it to (since Scintilla
won't be storing the data in utf-8 in this case.)

So what I think I will do is make the new C++ methods be named like
AddTextBytes or AddTextRaw, and they will make no assumptions about
whether the data is utf-8 or some ansi encoding and will instead just
pass the raw bytes around. Then in the Python class we can add methods
like AddTextUTF8 that use wx.GetDefaultPyEncoding() to convert to/from
utf-8 and whatever the default encoding is, if necessary.

Could you tell us more about AddTextUTF8? What are its arguments?
Should it take a string encoded in the local encoding as the argument?

It is just like AddText except it expects its parameter to be a utf-8
encoded string. If running in a unicode build of wxPython (so wxSTC is
using utf-8 internally) then it just passes the string as-is to
AddTextRaw; otherwise it will try to convert to the default encoding first.

Do you mean it's like this:
    # if we use AddTextUTF8
    stc.AddTextUTF8(s.decode('gb2312').encode('utf-8'))
Here is the actual function:
    def AddTextUTF8(self, text):
        """
        Add UTF8 encoded text to the document at the current position.
        Works 'natively' in a unicode build of wxPython, and will also work
        in an ansi build if the UTF8 text is compatible with the current
        encoding.
        """
        if not wx.USE_UNICODE:
            u = text.decode('utf-8')
            text = u.encode(wx.GetDefaultPyEncoding())
        self.AddTextRaw(text)
but AddText/SetText may be more convenient:
    stc.AddText(s.decode('gb2312'))
Yes. The UTF8 functions are mainly for people who want to manipulate
their documents as utf-8 strings from within their programs. If you
have strings in other encodings or unicode objects then using the
existing methods is best.
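
To make the conversion rule concrete without needing wx installed, here is a rough standalone sketch of the same logic. The function names utf8_to_raw and raw_to_utf8 are made up for illustration and are not part of wxSTC; the use_unicode and default_encoding parameters are stand-ins for wx.USE_UNICODE and wx.GetDefaultPyEncoding(). It works on byte strings, so it should behave the same in Python 2 and Python 3.

```python
# Hypothetical helpers mirroring the AddTextUTF8 logic above; not wxSTC API.

def utf8_to_raw(text, use_unicode, default_encoding):
    """Return the bytes that would be passed on to a raw add-text call.

    text             -- UTF-8 encoded byte string
    use_unicode      -- stand-in for wx.USE_UNICODE
    default_encoding -- stand-in for wx.GetDefaultPyEncoding()
    """
    if use_unicode:
        # Unicode build: the control stores UTF-8, so pass the bytes as-is.
        return text
    # ANSI build: transcode from UTF-8 to the default encoding.
    # Raises UnicodeEncodeError if the text does not fit that encoding.
    return text.decode('utf-8').encode(default_encoding)


def raw_to_utf8(raw, use_unicode, default_encoding):
    """Reverse direction: bytes as stored by the control, returned as UTF-8."""
    if use_unicode:
        return raw
    return raw.decode(default_encoding).encode('utf-8')


# A byte string in a local encoding, and the UTF-8 form a caller would
# have to produce before handing it to a UTF8-style method.
s = u'\u4e2d\u6587'.encode('gb2312')
utf8 = s.decode('gb2312').encode('utf-8')

print(utf8_to_raw(utf8, True, 'gb2312') == utf8)   # unicode build: unchanged
print(utf8_to_raw(utf8, False, 'gb2312') == s)     # ansi build: transcoded
```

In the ansi case the conversion only succeeds when every character in the utf-8 text exists in the default encoding, which is the "compatible with the current encoding" caveat in the docstring.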
--
Robin Dunn
Software Craftsman
http://wxPython.org Java give you jitters? Relax with wxPython!