wxPython.stc unicode cursor positioning bug

Hello,

There seems to be a bug in wxStyledTextCtrl (STC) when displaying
composing characters in Unicode. I have the following setup:

Windows XP, wxPython 2.4.2, Python 2.3

When I type composing Devanagari characters in the STC frame, the cursor
is offset from the actual text input location. To elaborate: in
Devanagari (as in many other scripts), certain characters which occupy
a single glyph are "composed" from multiple characters. This is where
STC seems to have a problem. When I type 'ka' in Devanagari, I get
a single glyph displayed in the frame, as expected. However, the cursor
is positioned one more space to the right, as if the character had
occupied two glyphs. It seems that the cursor position in
wxStyledTextCtrl is calculated by simply counting the individual
characters without taking composition into account.

I hope this is the correct place to report this bug.

On a side note, wxTextCtrl (which is also great! :) does _not_ have this
problem. The cursor is always positioned correctly. However, I would
like to use STC since it has a built-in syntax highlighter for HTML.

Thanks,
Srinath

Srinath Avadhanula wrote:

Hello,

There seems to be a bug in wxStyledTextCtrl (STC) when displaying
composing characters in Unicode. I have the following setup:

Windows XP, wxPython 2.4.2, Python 2.3

Is it wxPython 2.4.2.4 or 2.4.2.4u ?

When I type composing Devanagari characters in the STC frame, the cursor
is offset from the actual text input location. To elaborate: in
Devanagari (as in many other scripts), certain characters which occupy
a single glyph are "composed" from multiple characters. This is where
STC seems to have a problem. When I type 'ka' in Devanagari, I get
a single glyph displayed in the frame, as expected. However, the cursor
is positioned one more space to the right, as if the character had
occupied two glyphs. It seems that the cursor position in
wxStyledTextCtrl is calculated by simply counting the individual
characters without taking composition into account.

In the wxWindows UNICODE build, wxSTC takes the UTF-8 used internally by the Scintilla control, converts it to Unicode and uses the wxDC's GetTextExtent to measure each character. In the non-UNICODE build the same thing happens without the UTF-8/Unicode conversion, since both use ASCII. Nothing is done about checking for composed characters in either case. Even if we did something in wxSTC about it, I don't know if wxDC::GetTextExtent could handle it...

I'll think a bit more about it and see if a solution pops out.

On a side note, wxTextCtrl (which is also great! :) does _not_ have this
problem. The cursor is always positioned correctly.

Because it is using the native control in this case, not a custom one.

--
Robin Dunn
Software Craftsman
http://wxPython.org Java give you jitters? Relax with wxPython!

Thanks for replying...

> Windows XP, wxPython 2.4.2, Python 2.3

Is it wxPython 2.4.2.4 or 2.4.2.4u ?

It's 2.4.2.4u (I should have mentioned this earlier).

In the wxWindows UNICODE build, wxSTC takes the UTF-8 used internally by
the Scintilla control, converts it to Unicode and uses the wxDC's
GetTextExtent to measure each character. In the non-UNICODE
build the same thing happens without the UTF-8/Unicode conversion, since
both use ASCII. Nothing is done about checking for composed characters
in either case. Even if we did something in wxSTC about it, I don't know
if wxDC::GetTextExtent could handle it...

I'll think a bit more about it and see if a solution pops out.

That description suggests that wxSTC calls wxDC::GetTextExtent() for
each character typed rather than for whole words. This fails because
in Devanagari
    GetTextExtent(u'\u0915') + GetTextExtent(u'\u093f') !=
        GetTextExtent(u'\u0915\u093f')
From the documentation of wxDC.GetTextExtent, it looks like it can
handle strings rather than just single characters, so maybe when wxSTC
gets a new character, it should add that character to the current word
and then use the _change_ in the extent as the effective extent
of the additional character...
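
A minimal sketch of the idea above, in Python. `fake_extent` is a hypothetical stand-in for a real measuring call such as wxDC.GetTextExtent; it assumes 10 pixels per base character and no extra width for a combining mark that follows another character, so the numbers are illustrative only. The sketch contrasts summing per-character widths with measuring the growing prefix:

```python
import unicodedata

def is_mark(ch):
    """True for Unicode combining marks (categories Mn, Mc, Me)."""
    return unicodedata.category(ch) in ("Mn", "Mc", "Me")

def fake_extent(text):
    """Hypothetical stand-in for a real text-measuring call.

    Assumption: each base character is 10 px wide, and a mark that
    follows another character adds no width of its own.
    """
    width = 0
    for i, ch in enumerate(text):
        if i > 0 and is_mark(ch):
            continue  # treated as composed into the preceding glyph
        width += 10
    return width

def per_char_positions(text, measure):
    """Caret x positions obtained by summing single-character widths."""
    positions, x = [], 0
    for ch in text:
        x += measure(ch)
        positions.append(x)
    return positions

def prefix_positions(text, measure):
    """Caret x positions obtained by measuring each growing prefix."""
    return [measure(text[:i]) for i in range(1, len(text) + 1)]

ki = "\u0915\u093f"  # DEVANAGARI LETTER KA + DEVANAGARI VOWEL SIGN I
print(per_char_positions(ki, fake_extent))  # summed widths drift to the right
print(prefix_positions(ki, fake_extent))    # prefix widths follow the composed glyph
```

With a real device context the `measure` argument would be whatever returns the pixel width of a string; measuring every prefix costs more than measuring single characters, which is a trade-off this sketch does not address.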

The fact that u'\u0915' and u'\u093f' do get composed correctly by wxSTC
seems to me to indicate that the problem is not really severe.

(I am really completely unqualified to write about this, but it's just
a thought.)

BTW, is wxSTC in wxPython a very thin wrapper around the wxWindows
thing, or is all of this Python code somewhere?

Because it is using the native control in this case, not a custom one.

Okay.

Thanks,
Srinath

On Wed, 15 Oct 2003, Robin Dunn wrote:

Srinath Avadhanula wrote:

[...]

The fact that u'\u0915' and u'\u093f' do get composed correctly by wxSTC
seems to me to indicate that the problem is not really severe.

Thanks for the details. I need to readdress text measuring for the Mac anyway, so I'll see what I can do.

(I am really completely unqualified to write about this, but it's just
a thought.)

BTW, is wxSTC in wxPython a very thin wrapper around the wxWindows
thing, or is all of this Python code somewhere?

Thin Python wrapper, and the C++ code is a lumpy wrapper around Scintilla (thin in most places, with a few spots of thickness ;)

--
Robin Dunn
Software Craftsman
http://wxPython.org Java give you jitters? Relax with wxPython!

Thanks for the reply Robin!

Thanks for the details. I need to readdress text measuring for the Mac
anyway, so I'll see what I can do.

I eagerly look forward to any news. While we are at it, here are some
feature requests ;)

In wxSTC, we can set the caret width, but that's about all we can do
to customize the look of the caret. Here are a few feature suggestions:

1. The caret is always in the middle of two characters. In many
   situations, especially in modal editors like vi(m), it is more
   sensible for the caret to be _on_ a character. Such situations are
   not limited to just in the over-write mode... It will be nice to be
   able to set the location of the caret programmatically. Something
   like,
     wxSTC.SetBlockCaret()
   This will make the caret be a transparent solid block covering one
   character.

2. It would also be very nice if it were possible to set a transparent
   mask for the caret in situations where it is drawn over a
   character (i.e. in the block-caret mode above). This would be very
   useful for things like showing a little right-facing triangle in
   left-to-right editing and a left-facing one in right-to-left editing.

BTW, all these things are in a quest to make a "vi mode" for the
wxStyledTextCtrl. The caret kind of plays an important role in Vi and
provides lots of cues to the user about the mode he is currently in etc.
The reason I am trying to make a vi mode for wxSTC is that it seems to
handle proportional fonts and unicode characters very well. (Both these
are distant TODO's on the VIM design list).

> BTW, is wxSTC in wxPython a very thin wrapper around the wxWindows
> thing, or is all of this Python code somewhere?
>

Thin Python wrapper, and the C++ code is a lumpy wrapper around
Scintilla (thin in most places, with a few spots of thickness ;)

Okay! Then helping you out with this is kind of out of my league at this
point :) Is it possible to fiddle around with the wxSTC code without
knowing too much C++? Hmm... That question seems kind of pointless.
Never mind.

Thanks,
Srinath

On Wed, 15 Oct 2003, Robin Dunn wrote:

Srinath Avadhanula wrote:

Thanks for the reply Robin!

Thanks for the details. I need to readdress text measuring for the Mac
anyway, so I'll see what I can do.

I eagerly look forward to any news. While we are at it, here are some
feature requests ;)

In wxSTC, we can set the caret width, but that's about all we can do
to customize the look of the caret.

And its colour, and how often it flashes (if at all), and...
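
For reference, a small sketch of the caret settings mentioned here, written against any object that exposes the wxStyledTextCtrl caret methods `SetCaretWidth`, `SetCaretForeground` and `SetCaretPeriod`. The helper name `configure_caret` and its default values are made up for illustration, and passing a colour name as a string is an assumption about how the binding converts colours.

```python
def configure_caret(stc, width=2, colour="RED", period_ms=500):
    """Apply caret width, colour and blink period to an STC-like object.

    `stc` is assumed to be a wx.stc.StyledTextCtrl, or anything else
    that provides the same three methods.
    """
    stc.SetCaretWidth(width)        # caret width in pixels
    stc.SetCaretForeground(colour)  # caret colour
    stc.SetCaretPeriod(period_ms)   # blink period in ms; 0 should stop blinking
```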

Here are a few feature suggestions:

1. The caret is always in the middle of two characters.

Not always. When you are in overtype mode instead of insert mode, the caret switches to an underbar. Maybe you could use that by switching to read-only and overtype when you are in the vi command mode, and then, when the user does a command that goes into edit mode, you can change the STC modes back...

In many
   situations, especially in modal editors like vi(m), it is more
   sensible for the caret to be _on_ a character. Such situations are
   not limited to just in the over-write mode... It will be nice to be
   able to set the location of the caret programmatically. Something
   like,
     wxSTC.SetBlockCaret()
   This will make the caret be a transparent solid block covering one
   character.

But this would be nice too, although it would have to be done in the Scintilla layers, not wxSTC.

2. It would also be very nice if it were possible to set a transparent
   mask for the caret in situations where it is drawn over a
   character (i.e. in the block-caret mode above). This would be very
   useful for things like showing a little right-facing triangle in
   left-to-right editing and a left-facing one in right-to-left editing.

As would this.
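
The overtype/read-only workaround suggested above for a vi-style command mode could be sketched roughly as follows. The function names `enter_command_mode` and `enter_insert_mode` are hypothetical; `SetOvertype` and `SetReadOnly` are the wxStyledTextCtrl methods the sketch relies on, and `stc` is assumed to be such a control.

```python
def enter_command_mode(stc):
    """Switch an STC-like control to an underbar caret that cannot type.

    Overtype gives the underbar caret; read-only keeps stray keystrokes
    from changing the text while commands are being interpreted.
    """
    stc.SetOvertype(True)
    stc.SetReadOnly(True)

def enter_insert_mode(stc):
    """Restore normal editing: writable buffer and the usual line caret."""
    stc.SetReadOnly(False)
    stc.SetOvertype(False)
```

The key handling that decides when to call each function is left out; this only shows the two mode switches.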

--
Robin Dunn
Software Craftsman
http://wxPython.org Java give you jitters? Relax with wxPython!

--- Srinath Avadhanula <srinath@fastmail.fm> wrote:
> Hello,

There seems to be a bug in wxStyledTextCtrl (STC) when displaying
composing characters in Unicode. I have the following setup:

Windows XP, wxPython 2.4.2, Python 2.3

My setup is the same.

When I type composing Devanagari characters in the STC frame, the cursor
seems to be offset from the actual text input location.

However, the cursor is positioned one more space to the right, as if the
character had occupied two glyphs...

I have the same problem when I type Chinese characters in the STC
frame. A Chinese character consists of multiple characters too.
When I set the font to 'Courier New', it's better: there is no offset.
But as I said, a Chinese character consists of multiple characters, so
when I move the cursor, it can end up positioned in the middle of the
character.
But wxTextCtrl does not have any of these problems.
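
One possible reading of "multiple characters" here, given the earlier remark in this thread that Scintilla keeps its text as UTF-8: a single Chinese character takes several bytes in the buffer, and a caret position that counts bytes can land inside it. This is an assumption about the cause, not a confirmed diagnosis. The helper `snap_to_char_start` below is hypothetical and only shows how a byte offset can be moved back to a character boundary:

```python
def snap_to_char_start(data, pos):
    """Move a byte offset back to the start of the UTF-8 character it is in."""
    # UTF-8 continuation bytes always have the bit pattern 10xxxxxx.
    while 0 < pos < len(data) and (data[pos] & 0xC0) == 0x80:
        pos -= 1
    return pos

data = "a\u4e2d".encode("utf-8")  # 'a' followed by one Chinese character
print(len(data))  # number of bytes, not number of characters
print([snap_to_char_start(data, p) for p in range(len(data) + 1)])
```

Offsets that fall inside the multi-byte character all map back to its first byte, while offsets already on a boundary are left alone.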
