StyledTextCtrl.GetStyledText() returns random data?

I’ve been looking for a way to quickly get all the text and styles from a StyledTextCtrl.

The wxPython documentation says the GetCharacterPointer() method returns a MemoryBuffer object. I couldn’t find a description of the MemoryBuffer class, but I noticed it has tolist() and tobytes() methods. When I call those methods I get the values I expect.

The wxPython documentation says the GetStyledText() method returns a MemoryBuffer of cells. The Scintilla documentation says it should use two bytes for each cell, with the character at the lower address of each pair and the style byte at the upper address.
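
As a rough illustration of that layout, here is a minimal sketch in plain Python (no wx needed) that splits such a buffer into (character byte, style) pairs. The name split_cells is made up for this example and is not part of wxPython or Scintilla, and the sketch assumes the buffer really does follow the two-bytes-per-cell layout described above.

```python
def split_cells(buf):
    """Split a Scintilla-style cell buffer into (char_byte, style) pairs.

    Assumes two bytes per cell: character byte first, style byte second.
    split_cells is a hypothetical helper, not a wx or Scintilla function.
    """
    data = bytes(buf)
    if len(data) % 2:
        raise ValueError("cell buffer must have an even number of bytes")
    # Even offsets hold the character bytes, odd offsets hold the styles.
    return list(zip(data[0::2], data[1::2]))


if __name__ == "__main__":
    # Hand-built buffer: 'A' and 'B' with style 0, 'C' with style 1.
    print(split_cells(b"A\x00B\x00C\x01"))
```

With a real control, the bytes from the tobytes() method mentioned above could presumably be passed straight in. Note that the character values are individual bytes, so a non-ASCII character would span more than one cell.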

When I tested this I did get twice as many bytes from GetStyledText() as from GetCharacterPointer(). However, those bytes didn’t appear to bear any relation to the characters or styles in the STC. What’s more, the values were different every time I ran the program!

Here is a simple program that just puts a single upper-case ‘A’ in an STC (without explicitly setting a style) and then calls the two methods:

import wx
import wx.stc as stc

class MyFrame(wx.Frame):
    def __init__(self, *args, **kwds):
        kwds["style"] = kwds.get("style", 0) | wx.DEFAULT_FRAME_STYLE
        wx.Frame.__init__(self, *args, **kwds)
        self.SetSize((400, 300))
        self.SetTitle("Test GetStyledText()")
        self.main_panel = wx.Panel(self, wx.ID_ANY)
        main_sizer = wx.BoxSizer(wx.VERTICAL)
        self.stc = stc.StyledTextCtrl(self.main_panel, wx.ID_ANY)
        main_sizer.Add(self.stc, 1, wx.EXPAND, 0)
        self.main_panel.SetSizer(main_sizer)
        self.Layout()

        self.stc.AddText("A")

        cp = self.stc.GetCharacterPointer()
        c_list = cp.tolist()
        print(len(c_list))
        print(c_list)
        print(cp.tobytes())
        print()

        st = self.stc.GetStyledText(0, self.stc.GetLastPosition())
        t_list = st.tolist()
        print(len(t_list))
        print(t_list)
        print(st.tobytes())
        print()

        style = self.stc.GetStyleAt(0)
        print(style)


class MyApp(wx.App):
    def OnInit(self):
        self.frame = MyFrame(None, wx.ID_ANY, "")
        self.SetTopWindow(self.frame)
        self.frame.Show()
        return True

if __name__ == "__main__":
    app = MyApp(0)
    app.MainLoop()

Here are the results from running it once:

1
[65]
b'A'

2
[80, 238]
b'P\xee'

0

Is this a bug, or am I not doing it right?

I am running on Python 3.8.10 + wxPython 4.1.1 gtk3 (phoenix) wxWidgets 3.1.5 + Linux Mint 20.2

I’ve noticed that with a longer string in the STC, some of the data returned by GetStyledText() is correct.
I modified the above code to insert the string ‘The quick brown fox’ and it output the following values:

19
[84, 104, 101, 32, 113, 117, 105, 99, 107, 32, 98, 114, 111, 119, 110, 32, 102, 111, 120]
b'The quick brown fox'

38
[240, 3, 222, 1, 0, 0, 0, 0, 16, 240, 79, 1, 0, 0, 0, 0, 107, 0, 32, 0, 98, 0, 114, 0, 111, 0, 119, 0, 110, 0, 32, 0, 102, 0, 111, 0, 120, 0]
b'\xf0\x03\xde\x01\x00\x00\x00\x00\x10\xf0O\x01\x00\x00\x00\x00k\x00 \x00b\x00r\x00o\x00w\x00n\x00 \x00f\x00o\x00x\x00'

The first 16 values from GetStyledText() were wrong, but the remainder (starting with 107) were correct.

Hi, RichardT

On my PC (Python 3.8.6 + wx 4.1.1) Windows 10, I got the correct results.

1
[65]
b'A'

2
[65, 0]
b'A\x00'

0

It looks like a bug: on the Linux system the head of the buffer does not seem to be initialized to zero.
What happens if you use SetText or WriteText instead of AddText, or call ClearAll() first?

Hi, Kazuya,

Thank you for your reply. I tried all your suggestions, but unfortunately they didn’t fix the problem.

This issue should be reported to wxWidgets. But I think GetStyledText is rarely needed. :slightly_smiling_face:

From your previous post, I guess you are trying to get the style number of the text.
FYI:
The style number that GetStyledText returns is the number that the lexer automatically assigns to each token. For example, if SetLexer(stc.STC_LEX_PYTHON) is used, one of the following 16 styles is assigned to each token:

wx.stc.STC_P_DEFAULT                        0
wx.stc.STC_P_COMMENTLINE                    1
wx.stc.STC_P_NUMBER                         2
wx.stc.STC_P_STRING                         3
wx.stc.STC_P_CHARACTER                      4
wx.stc.STC_P_WORD                           5
wx.stc.STC_P_TRIPLE                         6
wx.stc.STC_P_TRIPLEDOUBLE                   7
wx.stc.STC_P_CLASSNAME                      8
wx.stc.STC_P_DEFNAME                        9
wx.stc.STC_P_OPERATOR                       10
wx.stc.STC_P_IDENTIFIER                     11
wx.stc.STC_P_COMMENTBLOCK                   12
wx.stc.STC_P_STRINGEOL                      13
wx.stc.STC_P_WORD2                          14
wx.stc.STC_P_DECORATOR                      15

If you want to get the indicator number such as stc.STC_INDIC_SQUIGGLE, the following methods should be used:

IndicatorValueAt(indicator, pos) -> int
IndicatorStart(indicator, pos) -> int
IndicatorEnd(indicator, pos) -> int
IndicatorAllOnFor(pos) -> int
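
As far as I understand the Scintilla documentation, the value returned by IndicatorAllOnFor(pos) is a bit mask in which bit n is set when indicator n is present at that position (only the first 32 indicators are represented). A small sketch of how that mask could be decoded, where indicators_from_mask is a hypothetical helper name and not a wx method:

```python
def indicators_from_mask(mask):
    """Return the indicator numbers whose bits are set in mask.

    Hypothetical helper: assumes bit n of mask corresponds to indicator n,
    for the first 32 indicators only.
    """
    return [n for n in range(32) if mask & (1 << n)]


if __name__ == "__main__":
    # A mask with bits 0 and 1 set means indicators 0 and 1 are present.
    print(indicators_from_mask(0b11))
```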

I have raised the issue with GetStyledText() on the wxPython Issue Tracker where it has been confirmed as reproducible.

As a workaround I have been experimenting with using GetText() and looping over the control’s contents and calling GetStyleAt(). This will be less efficient than using GetStyledText() but in most of my use cases the amount of text involved would be fairly small, so hopefully it should be OK.

    def getStyledText(self):
        """Return all the text and styles from the STC.

        :return: list of tuples [(char, style), ...] where
                  char is a one character str and style is an int.

        """
        text = self.GetText()

        # Valid positions run from 0 to GetLastPosition() - 1. This pairing
        # assumes one byte per character, i.e. ASCII-only text.
        styles = []
        for i in range(self.GetLastPosition()):
            style = self.GetStyleAt(i)
            styles.append(style)

        return list(zip(text, styles))

My objective is to provide a simple text editor that can be used to enter plain text and selectively apply a small number of predefined styles such as bold and italic. I do not want to have to use any form of markup in the text, so the use of a lexer would not be appropriate. The character + style data from the STC would be used to generate XML files (somewhat similar, but simpler than, those produced by the RichTextCtrl).
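
A minimal sketch of how such (char, style) pairs might be turned into XML, using only the standard library. This is not the actual file format from the project described above: the function name runs_to_xml and the element and attribute names ("text", "run", "style") are invented purely for illustration.

```python
import itertools
import xml.etree.ElementTree as ET


def runs_to_xml(pairs):
    """Build an XML string from [(char, style), ...] pairs.

    Consecutive characters with the same style are merged into one <run>.
    The element and attribute names are made up for this example.
    """
    root = ET.Element("text")
    for style, group in itertools.groupby(pairs, key=lambda p: p[1]):
        run = ET.SubElement(root, "run", style=str(style))
        run.text = "".join(char for char, _ in group)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    pairs = [("H", 0), ("i", 0), (" ", 0), ("y", 1), ("o", 1), ("u", 1)]
    print(runs_to_xml(pairs))
```

Grouping into runs keeps the file small when long stretches of text share one style, and reading it back only needs each run's text and style number.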

Thank you for opening the issue.

One piece of advice:
In stc, only lexers can set the font, size, and colour of the text. So you cannot do what a RichTextCtrl allows, such as selecting text and assigning styles to it, unless you write your own lexer.

(EDIT see discussions below)
But the additional decoration of text can be given by indicators (Scintilla Documentation), markers (Scintilla Documentation), etc.

I disagree. I have written several programs that apply simple styles to text in an stc without needing to use a lexer.

Here is a simple example that makes certain lines of output bold blue:

Consider another case where the user types some text into an stc. They select a word or phrase and click a button and the text is made bold. They click another button and the text plus styles of each character are stored in an XML file. At a later time they re-open the file and the text is inserted into the stc and the style information from the XML file is applied to each character. The data in the stc is now the same as when it was saved, without needing to use a lexer.

So nice!
In my use case, I wanted to highlight the searched words with a bold, yellow foreground, and red background. But I couldn’t find a way to do it and finally gave up… :worried:

Now I’m using an indicator to do that, but my complaint is that the colour is half transparent and not very vibrant. So I would really appreciate it if you could give me a code snippet to achieve this! :star_struck:

Two months ago I posted some sample code that uses an stc indicator to make words clickable
(Notebook word make clickable - #12 by komoto48g).
The following is a slightly modified version:

import wx
from wx import stc

class TestEditor(stc.StyledTextCtrl):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        
        self.StyleSetSpec(stc.STC_STYLE_DEFAULT,
                          "fore:#cccccc,back:#202020,face:MS Gothic,size:9")
        self.StyleClearAll()
        
        ## context-base style
        if wx.VERSION < (4,1,0):
            self.IndicatorSetStyle(0, stc.STC_INDIC_PLAIN)
            self.IndicatorSetStyle(1, stc.STC_INDIC_CONTAINER)
        else:
            self.IndicatorSetStyle(0, stc.STC_INDIC_TEXTFORE)
            self.IndicatorSetStyle(1, stc.STC_INDIC_ROUNDBOX)
        self.IndicatorSetForeground(0, "red")
        self.IndicatorSetForeground(1, "yellow")
        
        if wx.VERSION >= (4,1,0):
            self.IndicatorSetHoverStyle(1, stc.STC_INDIC_ROUNDBOX)
            self.IndicatorSetHoverForeground(1, "blue")
        
        self.Bind(stc.EVT_STC_INDICATOR_CLICK, self.OnIndicator)
    
    def OnIndicator(self, evt):
        if self.IndicatorValue == 1:
            p = self.IndicatorStart(1, evt.Position)
            q = self.IndicatorEnd(1, evt.Position)
            print(self.GetTextRange(p, q))
    
    def FilterText(self, text):
        if not text:
            for i in range(2):
                self.SetIndicatorCurrent(i)
                self.IndicatorClearRange(0, self.TextLength)
            return
        word = text.encode() # for multi-byte string
        raw = self.TextRaw # for multi-byte string
        lw = len(word)
        pos = -1
        while 1:
            pos = raw.find(word, pos+1)
            if pos < 0:
                break
            for i in range(2):
                self.SetIndicatorCurrent(i)
                self.IndicatorFillRange(pos, lw)

samplePhrase = """
She sells sea shells by the seashore.
The shells that she sells are sea shells I'm sure.
So if she sells sea shells on the seashore,
I'm sure that the shells are seashore shells.
"""

if __name__ == "__main__":
    app = wx.App()
    frame = wx.Frame(None)
    ed = TestEditor(frame)
    ed.Text = samplePhrase
    ed.FilterText("shells")
    frame.Show()
    app.MainLoop()

I hope it will work on Linux.
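
The search loop in FilterText() works on encoded bytes rather than on the str, presumably because stc positions are byte offsets. A stand-alone sketch of just that search step, runnable without wx (find_all is a made-up name for this example):

```python
def find_all(raw, word):
    """Return the byte offset of every occurrence of word in raw.

    Both arguments are bytes. Stepping by one byte after each hit means
    overlapping matches are found too, as in the FilterText() loop.
    """
    offsets = []
    pos = raw.find(word)
    while pos >= 0:
        offsets.append(pos)
        pos = raw.find(word, pos + 1)
    return offsets


if __name__ == "__main__":
    text = "naïve shells, more shells"
    raw = text.encode("utf-8")
    # Byte offsets, suitable for position-based stc calls.
    print(find_all(raw, "shells".encode("utf-8")))
    # Character index of the first hit, which differs once 'ï' is involved.
    print(text.find("shells"))
```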

Your code does run on Linux.
Here is a quick hack. It defines style 1 as bold yellow on red and applies that style to the found words in FilterText().

import wx
from wx import stc

class TestEditor(stc.StyledTextCtrl):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

        self.StyleSetSpec(stc.STC_STYLE_DEFAULT,
                          "fore:#cccccc,back:#202020,face:MS Gothic,size:9")
        self.StyleClearAll()
        self.StyleSetSpec(1, "fore:#ffff00,back:#ff0000,bold,face:MS Gothic,size:9")


        ## context-base style
        if wx.VERSION < (4,1,0):
            self.IndicatorSetStyle(0, stc.STC_INDIC_PLAIN)
            self.IndicatorSetStyle(1, stc.STC_INDIC_CONTAINER)
        else:
            self.IndicatorSetStyle(0, stc.STC_INDIC_TEXTFORE)
            self.IndicatorSetStyle(1, stc.STC_INDIC_ROUNDBOX)
        self.IndicatorSetForeground(0, "red")
        self.IndicatorSetForeground(1, "yellow")

        if wx.VERSION >= (4,1,0):
            self.IndicatorSetHoverStyle(1, stc.STC_INDIC_ROUNDBOX)
            self.IndicatorSetHoverForeground(1, "blue")

        self.Bind(stc.EVT_STC_INDICATOR_CLICK, self.OnIndicator)

    def OnIndicator(self, evt):
        if self.IndicatorValue == 1:
            p = self.IndicatorStart(1, evt.Position)
            q = self.IndicatorEnd(1, evt.Position)
            print(self.GetTextRange(p, q))

    def FilterText(self, text):
        if not text:
            for i in range(2):
                self.SetIndicatorCurrent(i)
                self.IndicatorClearRange(0, self.TextLength)
            return
        word = text.encode() # for multi-byte string
        raw = self.TextRaw # for multi-byte string
        lw = len(word)
        pos = -1
        while 1:
            pos = raw.find(word, pos+1)
            if pos < 0:
                break
            # for i in range(2):
            #     self.SetIndicatorCurrent(i)
            #     self.IndicatorFillRange(pos, lw)
            self.StartStyling(pos)
            self.SetStyling(lw, 1)

samplePhrase = """
She sells sea shells by the seashore.
The shells that she sells are sea shells I'm sure.
So if she sells sea shells on the seashore,
I'm sure that the shells are seashore shells.
"""

if __name__ == "__main__":
    app = wx.App()
    frame = wx.Frame(None)
    ed = TestEditor(frame)
    ed.Text = samplePhrase
    ed.FilterText("shells")
    frame.Show()
    app.MainLoop()

Note: this simple version has lost the ability to click on the selected words.

Thank you very much, Richard!
I now understand that without a lexer (or should I say with the default null lexer?) we can set the font style freely.
I also ran some tests and found that the available style numbers must be in range(32).
Sorry for my misunderstanding. :sweat:

I also tried simpler but more practical code that uses python-lexer where I define a new style to assign to a searched word. Unfortunately, this code didn’t work perfectly.

import wx
from wx import stc

STC_P_WORD3 = 16

class TestEditor(stc.StyledTextCtrl):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        
        self.SetLexer(stc.STC_LEX_PYTHON)
        
        self.StyleSetSpec(stc.STC_STYLE_DEFAULT,
                          "fore:#cccccc,back:#202020,face:MS Gothic,size:9")
        self.StyleClearAll()
        self.StyleSetSpec(STC_P_WORD3,
                          "fore:#ffff00,back:#ff0000,bold,face:MS Gothic,size:9")

    def FilterText(self, text):
        if not text:
            return
        word = text.encode() # for multi-byte string
        raw = self.TextRaw # for multi-byte string
        lw = len(word)
        pos = -1
        while 1:
            pos = raw.find(word, pos+1)
            if pos < 0:
                break
            self.StartStyling(pos)
            self.SetStyling(lw, STC_P_WORD3)

samplePhrase = """
She sells sea shells by the seashore.
The shells that she sells are sea shells I'm sure.
So if she sells sea shells on the seashore,
I'm sure that the shells are seashore shells.
"""

if __name__ == "__main__":
    app = wx.App()
    frame = wx.Frame(None)
    ed = TestEditor(frame)
    ed.Text = samplePhrase
    ed.FilterText("shells")
    frame.Show()
    app.MainLoop()

There are six occurrences of “shells” in the sample text, but only three are highlighted.
The reason seems to be that the Python lexer resets the style after SetStyling.

I hope the people who maintain stc will improve this so that users can add 15 more styles to the Python lexer.
But I wonder where I should send the pull request, wxWidgets? Phoenix? Scintilla? or @Robin?

This would be a slightly more efficient way. :slight_smile:

def getStyledText(self):
    return ((c, self.GetStyleAt(i)) for i, c in enumerate(self.Text))
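
One caveat, assuming (as the use of TextRaw and encode() earlier in the thread suggests) that stc positions are byte offsets into UTF-8 text: enumerate() counts characters, so for non-ASCII text the index may not be the position GetStyleAt() expects. A hypothetical helper, char_byte_offsets, sketches how each character could be paired with its byte offset instead:

```python
def char_byte_offsets(text):
    """Yield (char, byte_offset) for each character of text.

    The offset is the position of the character's first byte when the
    text is encoded as UTF-8. Hypothetical helper, not a wx method.
    """
    offset = 0
    for char in text:
        yield char, offset
        offset += len(char.encode("utf-8"))


if __name__ == "__main__":
    # 'é' takes two bytes in UTF-8, so later characters shift by one.
    print(list(char_byte_offsets("aé b")))
```

The generator above could then look up styles with the byte offset rather than the character index, for example ((c, self.GetStyleAt(pos)) for c, pos in char_byte_offsets(self.Text)). For ASCII-only text both versions give the same result.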

The StyledTextCtrl is a very thin wrapper around Scintilla; it basically just adds some wx flavor to the Scintilla classes. So changes to the internal functionality have to be made by the Scintilla project.

Thank you very much, Robin.
I found an old ticket (Scintilla / Feature Requests / #998 8 Python Keyword Styles), but there seems to have been no progress on it yet. Instead, in the latest SciTE 5.1.6/Scintilla:5.1.5, there are a few additional styles for f-strings.

# F-String
style.python.16=
# Single quoted f-string
style.python.17=
# Triple quoted f-string
style.python.18=
# Triple double quoted f-string
style.python.19=

I have noticed a problem with my idea of programmatically applying styles in a StyledTextCtrl relating to the Undo/Redo functionality. The built-in Undo/Redo does not appear to track changes of styles. Presumably this is because it expects the lexer to take care of it. I had not noticed this before because all of my previous use of the technique involved the display of progress/logging information that was read-only and not user editable.

So, for my current project I have gone back to trying to use a RichTextCtrl instead of a StyledTextCtrl. Of course that still leaves the problem of the waved underline not working on GTK, as described here.
